Joe Amenta's Blog

October 24, 2010

3to2 1.0 Released!

Filed under: 3to2 — AirBreather @ 3:32 am

Having had no particular reason not to, I have released 3to2 version 1.0.

Download it from Bitbucket:
2.7 version here
3.1 version here

PyPI:
3.1 package is called “3to2_py3k”
2.7 package is called “3to2”

The bug tracker is at http://bitbucket.org/amentajo/lib3to2/issues; submit a bug report if something either goes wrong or doesn’t go right!  (Or you can comment here or contact me directly via some other means.)

October 17, 2010

Still planning on more testing for a 1.0 release

Filed under: 3to2,Personal — AirBreather @ 12:55 pm

I’ve made no updates in a while, so (per blogging tradition) here’s the obligatory “I still exist” post:

I’m really busy in school this semester (two classes in particular demand way more work than they should), so all my free time ends up being “me time”.  I’ll eventually get around to testing 3to2 for a 1.0 release, hopefully by the end of the year.

August 24, 2010

Plans for lib3to2 post-GSoC 2010

Filed under: 3to2,Google Summer of Code,GSoC 2010,Work — AirBreather @ 5:42 pm

GSoC 2010 is officially over, and I feel that now is a good time to make plans for the future of lib3to2.

I will release “1.0” when I have fully tested lib3to2 in both Python 3.1 and Python 2.7.  I will also release a version of lib3to2 specifically for Python 2.5/2.6/2.7/3.0/3.1; I have already put the current lib2to3 trunk into a PyPI package called “two2three” to make lib3to2 more accessible to users of those versions of Python.  lib2to3, as released in each one of those versions of Python (except 2.5, which didn’t have it), has at least one bug that significantly affects lib3to2’s behavior.  Depending on how this ends up going, I may just make this the default, since most instances of Python in use will have the problem for quite a while.

This should hopefully make lib3to2 more widely used and thus generate more feedback.  I will not have the time to devote to testing lib3to2 on every piece of Python 3 code I can find, so I will need this feedback to help figure out where (and if) lib3to2 is deficient, beyond what I’ve already found.

August 11, 2010

Debugging on brain

Filed under: 3to2,Google Summer of Code,GSoC 2010,Work — AirBreather @ 5:55 pm

I’ve spent the better part of today and some of Monday and Tuesday hunting down why so many tests in brain are failing.  While I haven’t completely followed the tracebacks of every single error or failure, I can say that for those that I have, the problem does not appear to be caused by lib3to2, but rather by what I can only explain as differences in the sqlite3 module between the Python 3.1 and Python 2.7 versions.  This suggests that there might still be some work to do with lib3to2.

However, I really don’t see a possible lib3to2-based solution for this.  I wouldn’t even know where to start with this, since sqlite3 is just a thin wrapper around _sqlite3, a C extension module included with Python.  I’m going to tentatively call this one finished, after I fix a bug involving print statements and parentheses.  It shows that even though lib3to2 is pretty good at syntax refactoring, there are some differences (often very subtle) between Python3 and Python2 that lib3to2 will fail to catch and will need manual refactoring (and it’s thanks to brain’s excellent test suite that I found those in the first place!).  Furthermore, lib3to2 is best used as an aid throughout the entire development process, as it will catch incompatibility problems early, before too much code starts building on incompatible code.

I’m also running 3to2’d code with Python 2.7, so there may be some additional problems with Python 2.6 and earlier.

August 7, 2010

Testing on Brain

Filed under: 3to2,Google Summer of Code,GSoC 2010,Work — AirBreather @ 5:02 pm

On PyPI, I’ve run across a substantial third-party module written in Python 3 that is well-tested and demonstrates significant errors with lib3to2.  It’s called brain, and it has 202 test cases.  They all pass in Python 3.1, but fewer than a quarter of them pass in Python 2.7 after running 3to2 on them:

FAIL: 0 failures, 158 errors, 44 passed

This is a good real-world example of native Python 3 code that doesn’t get properly refactored with lib3to2, and I’m going to really focus on this.

My new goal is to get that down to 0 errors.  Some failures might be OK: Python 3 and Python 2 have semantic differences in some cases, especially regarding the differences between Python 2 str and Python 3 bytes objects, that would justify leaving the code untouched.

System setup:

In order to efficiently test this, I have done the following tricks (on Linux):

  • Downloaded the gzipped tarball and extracted it to /home/joe/brain-0.1.6
  • Made a symbolic link to ../brain in /home/joe/brain-0.1.6/brain/test
  • cd /home/joe/brain-0.1.6/brain/test
  • Run 3to2 with the command 3to2 -j10 -w --no-diffs -fall -xprint -fprintfunction {brain/,internal/,public/,}*.py (-j10: run 10 separate processes to fully utilize my CPU and make this finish a lot sooner; -w: write to file; --no-diffs: don’t give me the diff output, since I’m just running the tests anyway; {-fall, -xprint, -fprintfunction}: run with the normal set of fixers, except use “from __future__ import print_function” instead of refactoring to print statements… for now; and {brain/,internal/,public/,}*.py: bash sorcery to get all the .py files in the current directory and in the public, internal, and brain directories.)
  • Change all unicode literals in run.py into regular string literals (optparse barks at you otherwise).  I have two shell aliases for doing that: alias a='sed -i s/u\'\''/\'\''/g run.py'; alias b='sed -i '\''s/u"/"/g'\'' run.py' (yes, about half an hour of my time was indeed spent just getting the right combination of shell quotes and escapes to play nicely with the regex quotes and escapes)
  • Then run python2.7 run.py func
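
The unicode-literal cleanup in the list above could also be done from Python instead of sed.  This is only a minimal sketch: strip_u_prefixes is a hypothetical helper name, and the regex is nearly as naive as the sed aliases (it does not parse Python, so a u' sequence inside a comment or inside another string would be rewritten too).

```python
import re

# A standalone "u" or "U" prefix sitting directly in front of a quote character.
_U_PREFIX = re.compile(r"\b[uU](?=['\"])")


def strip_u_prefixes(source):
    """Return source with u'...' and u"..." literals turned into plain literals."""
    return _U_PREFIX.sub("", source)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from brain's run.py.
    print(strip_u_prefixes("parser.add_option(u'--verbose', help=u\"chatty\")"))
```

The \b anchor keeps identifiers that merely end in “u” (such as a dict key like 'menu') from being touched, which the plain sed substitution would not guarantee.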

Specific notes:
  • inspect.getfullargspec doesn’t exist in Python 2.  inspect.getargspec is an equivalent that works in both branches.
  • unittest in Python2.7 (unittest2 from PyPI for earlier versions) does not have “assertSameElements”, so any tests in Python 3 that use that will error out.
  • Extended iterable unpacking in an implicit assignment context wasn’t handled at first (I should get on that…); it is fixed now.
  • optparse in Python 2.7 doesn’t like certain arguments to be unicode objects.  Nothing I can do there; it’s semantics.
  • Oh no!  BLOBs correspond with bytes objects in sqlite3 for Python3, but using str for the same in Python2’s sqlite3 is not exactly compatible; specifically, there are problems with nul chars.  I hope that this issue isn’t too pervasive…
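
For the getfullargspec note above, code that has to run unmodified on both branches could look the function up by name.  This is a hedged sketch, not what the lib3to2 fixer emits; get_arg_names and example are made-up names.  The lookup is done lazily because inspect.getargspec was later removed from Python 3 (in 3.11), so it can no longer be assumed to exist everywhere.

```python
import inspect


def get_arg_names(func):
    """Return the positional parameter names of func on Python 2 or Python 3."""
    # Prefer getfullargspec (Python 3); only touch getargspec if it is missing.
    getspec = getattr(inspect, "getfullargspec", None) or inspect.getargspec
    return list(getspec(func).args)


def example(a, b, c=1):
    """Hypothetical function used only to exercise get_arg_names."""
    pass


if __name__ == "__main__":
    print(get_arg_names(example))
```

Both functions return a named tuple with an args field, so the rest of the calling code does not need to know which one was used.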

Progress Updates:
  • After adding a fixer for getfullargspec:
    FAIL: 68 failures, 65 errors, 69 passed
  • After manually adding in assertSameElements from before python r79132:
    FAIL: 79 failures, 52 errors, 71 passed
  • After fixing some problems with fix_bytes:
    FAIL: 90 failures, 37 errors, 75 passed
    At this point, almost all the unhandled exceptions (“errors”) raised are raised by brain, with the remainder raised by the underlying library for content-based reasons, so they’re glorified “failures”.
    I like this; it means that most of what’s left is probably semantic errors.

August 5, 2010

Testing on Python 3 standard library: really frustrating

Filed under: 3to2,Google Summer of Code,GSoC 2010,Work — AirBreather @ 5:01 pm

I’ve been devoting much of my time to testing on the Python 3.1 standard library, and let me say that it is really frustrating.  I’ve been trying to make various sufficiently standalone modules pass their associated tests (also run with 3to2), and I’ve been successful in some ways.  In particular, Lib/site.py passes all the tests in Lib/test/test_site.py, which I’m really happy about.  However, in other cases like Lib/io.py, there’s such a strong reliance on the behavior of pre-compiled builtin C modules that I can’t possibly get it to work 3to2’d.

And there’s so much that depends on the io module that I am having a really hard time hunting down lib3to2 bugs that exist beyond throwing basic SyntaxErrors.  Also, I’m sure there might be a few bugs in there that are being masked by failures resulting from comparing strings like ‘test output’ with u’test output’, which aren’t technically bugs with lib3to2, but rather with the level of specificity given by the testing code.

So as a result, I’m going to spend the rest of the GSoC time looking again for well-tested third-party Python 3 code and trying it out with 3to2, since the Py3k standard library is a little bit too complicated a beast for me.

To that end, here’s what I’ve tested, and the results:

httplib2 from PyPI: All tests passed except for 1 error in iri2uri: iterations over bytes objects return an int value in py3k but a str of length 1 in py2k.  The specific error can be fixed by calling ord() on each value returned.
brain from PyPI: Less than a third of all functional test cases pass.
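
The bytes-iteration difference behind the httplib2 error can be papered over with a small helper.  This is a minimal sketch under my own naming (byte_values is hypothetical and is not part of httplib2 or lib3to2); it applies ord() only when iteration hands back a length-1 string, as it does in py2k.

```python
def byte_values(data):
    """Return the integer value of each byte in data on Python 2 or Python 3."""
    # Python 3 yields ints when iterating over bytes; Python 2 yields 1-char strs.
    return [b if isinstance(b, int) else ord(b) for b in data]


if __name__ == "__main__":
    print(byte_values(b"AB"))
```

Wrapping the data in bytearray() before iterating is another option, since a bytearray yields ints in both branches.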

July 31, 2010

Fixer for keyword-only arguments

Filed under: 3to2,Google Summer of Code,GSoC 2010,Work — AirBreather @ 1:50 pm

I have finished the fixer for keyword-only parameters.  This includes things like this:

def spam(arg1, arg2, *, kw_only_arg1, kw_only_arg2): pass

The param list would fix to: (arg1, arg2, **_3to2kwargs), and the body of the function would immediately look for ‘kw_only_arg1’ and ‘kw_only_arg2’ in the _3to2kwargs dict, assign to their respective names, and delete the values from the dict.

Special considerations are made for when the *args and the **kwargs params exist.  Check the test cases for exactly what happens.
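
For the simple case described above (no *args or **kwargs of its own), the refactored function might look roughly like the following.  This is a hand-written sketch of the shape, not verbatim fixer output, and the return statement is added here only so the example has something to show (the original body was just pass).

```python
# Python 3 original:
#     def spam(arg1, arg2, *, kw_only_arg1, kw_only_arg2): pass

def spam(arg1, arg2, **_3to2kwargs):
    # Pull each keyword-only argument out of the catch-all dict, then delete it.
    kw_only_arg1 = _3to2kwargs['kw_only_arg1']; del _3to2kwargs['kw_only_arg1']
    kw_only_arg2 = _3to2kwargs['kw_only_arg2']; del _3to2kwargs['kw_only_arg2']
    # Added for illustration only.
    return arg1, arg2, kw_only_arg1, kw_only_arg2


if __name__ == "__main__":
    print(spam(1, 2, kw_only_arg1=3, kw_only_arg2=4))
```

One consequence of this shape, at least in the sketch: leaving out a required keyword-only argument raises KeyError from the dict lookup rather than the TypeError that Python 3 would raise at call time.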

Note that this fixer is VERY sensitive to incorrect code, so as always, make sure that code actually works in Python 3 before running it through 3to2.

July 27, 2010

Back into the swing of things

Filed under: 3to2,Google Summer of Code,GSoC 2010,Work — AirBreather @ 2:08 am

I’ve managed to get myself back to coding after an unexpectedly long absence.  I built up a lot of inertia over all that time, so I’m glad to finally overcome it.

I did hit a snag related to my laptop’s Ubuntu installation; this results in me not being able to run the 3to2 tests right now.  I’ll just commit what I have (the recent change probably does what I think it does, but it’s disabled by default so that it doesn’t bother anyone) and work through this tomorrow.  Shouldn’t take more than a few minutes when I’m fully awake, but I’ve learned from experience that I shouldn’t try to do significant sysadmin stuff this late at night if it’s not time-critical.

July 23, 2010

RL just ate my week

Filed under: 3to2,Google Summer of Code,GSoC 2009,GSoC 2010,Personal,Work — AirBreather @ 5:09 pm

This past week, I’ve just had SO many important things going on in real life.  I’m going to start coding again after this weekend (that’s Monday, July 26), when everything should be settled down.

Currently on my plate is a fixer for the extended funcdef parameter syntax.  I’ve already got tests in the repository, just need to work on implementing the fixer.

On a side note, I’m rather proud of how well lib3to2 does with the Py3k standard library.  Most modules I’ve tried just work, and those that fail tests often fail for other reasons than lib3to2 failures.  For example, lib3to2 doesn’t refactor the io module to fully pass its test, because the io module expects the _io module to behave a certain way.

July 14, 2010

Back from Break

Filed under: 3to2,Google Summer of Code,GSoC 2010,Personal,Work — AirBreather @ 12:25 am

I went to Chicago this weekend and just got back tonight.  Back to work tomorrow!

Edit: OK, maybe not today, I need a few days to chill and get my motivation back.
