This week hasn’t been a great week as far as coding hours go; I’ve just been obnoxiously distracted. From the compulsion to get Brawl Minus working on Dolphin to playing with my new Nexus One, I haven’t worked the number of hours that I should have. I’ll try to do better next week.
In spite of that, I did manage to reorganize the test cases in fix_imports2 and even get some of them to pass. I also optimized the pattern to match Python code that needs fixing; this results in a huge speed increase when running fix_imports2, as evidenced when running the tests.
Specifically, fix_imports2 now fixes cases of “from spam.ham import eggs” (no “as bacon” or multiple comma-separated modules yet) and “from spam.ham import *”. The framework is already there to extend this to the rest of the “from spam.ham import …” cases, which will possibly end up being done today, or maybe not. At this point, it looks very reasonable to expect “from spam.ham import …” imports to be done by the end of next week.
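As a rough illustration of the two cases handled so far, here is a minimal sketch. It is not the fixer's actual code: it uses a regular expression on a single line of source instead of the fixer's own pattern, and the table PY3_TO_PY2 and the function fix_from_import are hypothetical names made up for this example. Splitting “import *” into one line per Python 2 module is also just an assumption about what a sensible result would look like.

```python
import re

# Hypothetical, hand-written table for illustration only:
# Python 3 module -> {Python 2 module: some of the names it provides}.
PY3_TO_PY2 = {
    "urllib.request": {
        "urllib2": {"urlopen", "Request", "build_opener"},
        "urllib": {"urlretrieve", "pathname2url"},
    },
}

# Matches only "from spam.ham import eggs" and "from spam.ham import *".
# "as bacon" and comma-separated names deliberately do not match.
FROM_IMPORT = re.compile(r"^from\s+([\w.]+)\s+import\s+(\w+|\*)\s*$")


def fix_from_import(line):
    """Rewrite one simple from-import line, or return it unchanged."""
    match = FROM_IMPORT.match(line)
    if match is None:
        return line
    module, name = match.groups()
    mapping = PY3_TO_PY2.get(module)
    if mapping is None:
        return line
    if name == "*":
        # Assumption: a star import pulls from every module in the table.
        return "\n".join("from %s import *" % py2 for py2 in sorted(mapping))
    for py2, names in sorted(mapping.items()):
        if name in names:
            return "from %s import %s" % (py2, name)
    return line  # unknown name: leave the line alone


print(fix_from_import("from urllib.request import urlopen"))
print(fix_from_import("from urllib.request import *"))
```

Lines with “as bacon” or several comma-separated names fall through unchanged here, which mirrors the cases that are not handled yet.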
“import spam.ham” variants along with usage will be the next subtask after that. I think that, from a pragmatic standpoint, a better approach than the one I’ve been considering would be just to change “import urllib.request” to “import urllib2, urllib”, then refactor uses of “urllib.request” on a case-by-case basis.
My previous approach was, upon encountering “import urllib.request”, to find all references to “urllib.request” and refactor them, then import “urllib2” and/or “urllib” as used. If only names from “urllib2” were used, then only “urllib2” would be imported.
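To make that usage-first idea concrete, here is a small sketch under heavy assumptions: it works on raw text with regular expressions instead of a parse tree, and the table NAME_TO_PY2_MODULE and the function fix_usage_first are hypothetical names invented for the example. The point is only that the import line cannot be written until every reference has been inspected.

```python
import re

# Hypothetical table for illustration: name -> Python 2 module providing it.
NAME_TO_PY2_MODULE = {
    "urlopen": "urllib2",
    "Request": "urllib2",
    "urlretrieve": "urllib",
}

USAGE = re.compile(r"\burllib\.request\.(\w+)")
IMPORT_LINE = re.compile(r"(?m)^import urllib\.request$")


def fix_usage_first(source):
    """Refactor every reference first, then import only what was used."""
    needed = set()

    def replace(match):
        py2_module = NAME_TO_PY2_MODULE.get(match.group(1))
        if py2_module is None:
            return match.group(0)  # unknown name: leave it untouched
        needed.add(py2_module)
        return "%s.%s" % (py2_module, match.group(1))

    rewritten = USAGE.sub(replace, source)
    if not needed:
        # Nothing recognised was referenced, so the import is left as it was.
        return rewritten
    # The import can only be decided after all usages have been seen.
    return IMPORT_LINE.sub("import " + ", ".join(sorted(needed)), rewritten)


print(fix_usage_first("import urllib.request\npage = urllib.request.urlopen(url)\n"))
```

With only urlopen referenced, the sketch ends up importing just urllib2, which is the behaviour described above.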
This way is overly complex and adds serious mental strain. I originally thought to try it this way because that’s closer to the way that Python logically ends up running such code: it sees “import urllib.request” (or “import urllib.request as billy”) and binds that name in the namespace. A change to that import statement (and the name it binds) will affect everything that references that specific name, and all references must be changed in tandem.
In theory, this is an approach suitable for handling “import urllib.request as subtle”, and I can’t think of a better way to handle that case. However, I’ve already determined that binding a builtin module to a new name adds too much complexity and/or ambiguity (if “import urllib.request as subtle” needs both “urllib” and “urllib2”, which one gets the name “subtle”? should lib3to2 just pick an arbitrary similar [should it even be similar? how similar?] name for the module that does not? why rebind the second one to begin with? follow-up: why rebind the first one if the second one doesn’t have a good reason to be? should it be acceptable to replace this with “import urllib2 as subtle” when only names from “urllib2” are used, but “import urllib, urllib2” in other cases? what about third-party usage of the name “subtle”?). So, that reasoning no longer applies. The same argument can be applied to “from urllib import request”, as it is essentially the same as “import urllib.request as request”. In the end, I was so committed to that idea, that I did not take seriously enough the idea of refactoring the import statement and usage separately.
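The claim that “from urllib import request” is essentially “import urllib.request as request” can be checked directly. This is a small demonstration meant to be run under Python 3 (the module does not exist in Python 2); the names request_a and request_b are arbitrary.

```python
import sys

import urllib.request as request_a
from urllib import request as request_b

# Both statements bind a local name to the very same module object,
# so a fixer faces the same rebinding problem in either spelling.
print(request_a is request_b)
print(request_a is sys.modules["urllib.request"])
```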
I think I will approach the next part of fix_imports2 (after “from spam.ham import …” cases are done and slightly more thoroughly tested) assuming that “urllib.request.urlopen” actually means the name “urlopen” from the builtin “urllib.request” module that has already been imported, and that “import urllib.request” means to import everything that provides “urllib.request” functionality, even if it is never referenced later.
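Here is a minimal sketch of what treating the import and the usage separately could look like. It is only an illustration, not fixer code: it rewrites plain text line by line, and NAME_TO_PY2_MODULE, fix_import, fix_usage and fix_source are hypothetical names made up for the example. Each step looks at its own line and nothing else.

```python
import re

# Hypothetical table for illustration: name -> Python 2 module providing it.
NAME_TO_PY2_MODULE = {
    "urlopen": "urllib2",
    "Request": "urllib2",
    "urlretrieve": "urllib",
}

USAGE = re.compile(r"\burllib\.request\.(\w+)")


def fix_import(line):
    """Always import everything that provides urllib.request functionality."""
    if line.strip() == "import urllib.request":
        return line.replace("import urllib.request", "import urllib2, urllib")
    return line


def fix_usage(line):
    """Rewrite each urllib.request.NAME on its own, case by case."""
    def replace(match):
        py2_module = NAME_TO_PY2_MODULE.get(match.group(1))
        if py2_module is None:
            return match.group(0)  # unknown name: leave it untouched
        return "%s.%s" % (py2_module, match.group(1))

    return USAGE.sub(replace, line)


def fix_source(source):
    # Neither step needs to know what the other one did.
    return "\n".join(fix_usage(fix_import(line)) for line in source.split("\n"))


print(fix_source("import urllib.request\npage = urllib.request.urlopen(url)\n"))
```

The import line comes out as “import urllib2, urllib” even when nothing from “urllib” is referenced later, which is exactly the trade-off being accepted: a possibly unused import in exchange for much simpler logic.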
With that finished, fix_imports2 should have the level of completion that the other fixers have.