Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(839)

Issue 21043: Patch to secure Python interpreter (Closed)

Can't Edit
Can't Publish+Mail
Start Review
Created:
15 years, 11 months ago by GvR
Modified:
15 years, 6 months ago
Reviewers:
asktav
CC:
tav
Base URL:
http://svn.python.org/view/*checkout*/python/branches/release25-maint/
Visibility:
Public.

Description

Closing in favor of http://codereview.appspot.com/20051/show

Patch Set 1 #

Total comments: 12

Patch Set 2 : Tav's v2 #

Total comments: 3
Unified diffs Side-by-side diffs Delta from patch set Stats (+189 lines, -58 lines) Patch
M Lib/dis.py View 1 1 chunk +2 lines, -2 lines 0 comments Download
M Lib/doctest.py View 1 2 chunks +2 lines, -2 lines 0 comments Download
M Lib/inspect.py View 1 3 chunks +4 lines, -4 lines 0 comments Download
M Lib/types.py View 1 1 chunk +1 line, -1 line 0 comments Download
M Objects/classobject.c View 6 chunks +3 lines, -28 lines 0 comments Download
M Objects/fileobject.c View 1 chunk +0 lines, -9 lines 1 comment Download
M Objects/funcobject.c View 1 2 chunks +3 lines, -5 lines 0 comments Download
M Objects/genobject.c View 1 1 chunk +1 line, -1 line 0 comments Download
M Objects/methodobject.c View 2 chunks +1 line, -6 lines 2 comments Download
M Objects/typeobject.c View 1 1 chunk +7 lines, -0 lines 0 comments Download
M Python/sysmodule.c View 1 2 chunks +165 lines, -0 lines 0 comments Download

Messages

Total messages: 6
GvR
Hi tav (do you have a real name?), I will review your patch using Rietveld. ...
15 years, 11 months ago (2009-02-20 22:05:47 UTC) #1
GvR
I would prefer to use a different approach, as you can see from my comments ...
15 years, 11 months ago (2009-02-20 22:28:24 UTC) #2
asktav
Hey Guido, > Hi tav (do you have a real name?), http://tav.espians.com/about-tav.html#what-s-your-name > I will ...
15 years, 11 months ago (2009-02-21 02:10:25 UTC) #3
asktav
Hey Guido, I agree with most of your changes, but I'd like to push back ...
15 years, 11 months ago (2009-02-21 03:14:31 UTC) #4
GvR
On Fri, Feb 20, 2009 at 9:58 PM, <codesite-noreply@google.com> wrote: > Comment #13 on issue ...
15 years, 11 months ago (2009-02-21 15:51:42 UTC) #5
asktav
15 years, 11 months ago (2009-02-22 13:00:33 UTC) #6
Dear Guido,

Thank you for such a detailed reply and for not using my full name =)

I had studied rexec and was aware of the use of __builtins__. However, it hadn't
twigged that a function would be operating in the context of the __builtins__
defined in its func_globals, and thus there is no need for the various sys.get_*
functions or the Lib/ patches. Doh!! Sorry for wasting your time and mine with
all that fluff.

And, you are right, it really does solve the problem quite elegantly. It's hard
arguing with someone who has a time machine ;p

If it's any consolation, one of my main intended use of all this is to recreate
somewhat of a bastard child of Grail applets. Python services over the cloud --
running directly on App Engine instances for backend services and occasionally
compiled to "secure javascript" (like Caja does) for use on the frontend.
Hopefully the world is ready for this kind of stuff now.

So, given the new found understanding, the two definitive changes I think we can
both agree on is:

* type.__subclasses__ being restricted
* GeneratorType.gi_frame being restricted

In Python 2.6 and above:

* GeneratorType.gi_code also needs to be restricted

We can also now definitely agree on not needing the:

* sys.get_* functions
* Lib/ patches

Again, doh!!

I think the only question left is: whether to lift the other restrictions or
not? This question splits into two halves -- some "immaterial" RESTRICTIONs and
the restrictions related to class structures.

Some of the restrictions, e.g. __module__ on built-in functions/methods, are
immaterial. And my argument for lifting those restrictions is purely in the
context of simplifying the design. Seeing as they don't really impact security
considerations, why make attributes like func_name, __doc__, __module__, etc.
*RESTRICTED? It just seems to add needless complexity to the design and gives
users something else to watch out for...

As for the class restrictions, I'm not so hot on them. I know that the following
will always work:

  def FileReader(filename, mode='r', buffering=0):

      if mode not in ['r', 'rb', 'rU']:
          mode = 'r'

      file = open(filename, mode, buffering)

      def read(bufsize=-1):
           return file.read(bufsize)

      return Namespace(read=read)

I can pass FileReader to untrusted code without worry because it'll never be
able to get at the real ``file`` type in restricted mode as it doesn't have
access to func_globals (in restricted mode).

You are right in that it's inefficient. There are 2 function calls for the price
of one. But it's a price that, at least personally, I'm willing to pay for
knowing that there isn't some kind of open exploit. Unfortunately, my knowledge
of the Python internals isn't good enough to know that I could have that same
level of security with class-based structures.

One of my first Python projects (back in 2000) was an IRC bot which provided a
.py feature executing arbitrary Python code. It was a very useful feature for
both learning and teaching Python and was built on top of the rexec framework.
It was also an interesting exercise in learning the limitations of rexec -- as
you can imagine it became a sport for everyone to try and overcome the various
security limitations...

And after trying to patch various holes, I eventually gave up hope and retired
the feature from the bot. I simply didn't have enough knowledge of the various
Python internals with regards to classes. And, sadly, I still don't =(

And, things like descriptors -- which are really elegant btw -- sadly just add
more complexity to the issue. There are (at least seemingly) a *lot* of moving
parts that need to be "locked down".

In contrast, the exposure of function internals is limited to the various func_*
attributes and by denying access to them in restricted mode, I find myself much
more confident... in fact, very confident!

And whilst it is inefficient, it is a price worth paying imo. I also came up
with a cute Namespace() constructor that takes functions which are defined
inside functions and returns a pseudo-class instance like structure so that
normal Pythonic idioms can be used.

It's implemented in
http://github.com/tav/plexnet/blob/9dabc570a2499689e773d1af3599a29102071f80/s...

It takes advantage of tuple's immutability and uses dynamic metaclasses to
create the pseudo-class-instance wrapper. And whilst it's only about 1.2x slower
for "method" calls, it's sadly about 46x slower in comparison to the equivalent
class-based approach for instance initialisation.

My previous dabblings with C were actually an attempt to create Namespace as an
extension type:
http://github.com/tav/plexnet/blob/9dabc570a2499689e773d1af3599a29102071f80/s...

This managed to bring down the instantiation overhead to just 3-5x, but also
lost some functionality along the way, e.g. builtin functions like len() don't
treat a __len__ function which getattro returns in the same way as it normally
treats a __len__ class method. So whilst the feature works in the slow pure
Python capbase.py implementation, it didn't in the C extension. And, as you can
see from .capbase.c, my knowledge of C isn't good enough =(

Also, there are interesting developments like:
http://google-caja.googlecode.com/svn/trunk/doc/html/cajitaOptimization/index...
which I think we could possibly adopt for use on Python/AppEngine.

Sorry for rambling =(

The point that I am trying to make is that despite the inefficiencies, I feel
much more confident with security through a function based approach in
restricted mode than I do with class-based restrictions. And, as such, would
petition for the removal of the various class related restrictions.

To me, this would just give others a false sense of security which we cannot
really guarantee without doing a full-on audit of some kind...

However, I can also see that one day it could be possible to be confident enough
of a class-based approach too. And, seeing as the removal of gi_frame and
__subclasses__ enables me to do what I want, perhaps I shouldn't be too greedy
;p

Sorry again for rambling. Please let me know your thoughts and what I should do
for the next version of the patch.

-- 
Many thanks again, tav
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b