So I’ve started the rewrite of fix_imports2. Here’s what I’ve done so far:
- Condensed the list of py2k module members to only the documented ones. If I removed any important ones, let me know in the comments here.
- Added a few helper functions that will be integral to development.
- Added more tests, though I will be adding more before I end up implementing enough to pass any of the current ones.
The goal of fix_imports2 is to replace the import and usage of a single module with the import and usage of several. This has a bunch of implications. While adding more tests and thinking about all the ways that you can import stuff in Python, I thought of two issues that I would like to share with the rest of the class. I consult The Zen of Python to try to determine which is the “right way” and which is the “wrong way” to proceed, even though the whole module itself breaks one of the tenets (“in the face of ambiguity, refuse the temptation to guess”).
- “from http import server” and “import http.server as something_else” bind to a single name what could turn into multiple modules. fix_imports2 will attempt to disambiguate this. In the best case, this will end up in one of two ways:
1) Intentionally or accidentally, the code only makes use of members of the py3k module provided by a single py2k module, and it is possible to keep that name, or
2) The code makes use of names from the py3k module that are provided by multiple different py2k modules, and the one imported name cannot stand in for all of them.
The right way to fix 2) (“explicit is better than implicit”) is to bind neither module to the original name and replace each usage of the original name with the relevant one.
This turns 1) into an even more special case (“special cases aren’t special enough to break the rules”). I believe that the “right way” to resolve this is to remove the special bindings altogether (“there should be one– and preferably only one –obvious way to do it”).
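To make the two cases concrete, here is a rough sketch of the check involved. It is not the fixer's actual code: the abbreviated mapping table and both function names are made up for illustration, though the member-to-module pairs themselves match the py2k standard library.

```python
# Hypothetical, abbreviated table: http.server member -> py2k module that provides it.
HTTP_SERVER_MEMBERS = {
    "HTTPServer": "BaseHTTPServer",
    "BaseHTTPRequestHandler": "BaseHTTPServer",
    "SimpleHTTPRequestHandler": "SimpleHTTPServer",
    "CGIHTTPRequestHandler": "CGIHTTPServer",
}


def py2k_modules_needed(used_names, mapping=HTTP_SERVER_MEMBERS):
    """Return the sorted py2k modules that provide the given member names."""
    return sorted({mapping[name] for name in used_names})


def can_keep_single_name(used_names, mapping=HTTP_SERVER_MEMBERS):
    """True for case 1): every used member comes from one py2k module."""
    return len(py2k_modules_needed(used_names, mapping)) == 1


# Case 1): only BaseHTTPServer members are used, so one name could survive.
print(can_keep_single_name(["HTTPServer", "BaseHTTPRequestHandler"]))
# Case 2): members come from two py2k modules, so one name cannot cover both.
print(py2k_modules_needed(["HTTPServer", "SimpleHTTPRequestHandler"]))
```

Dropping the special binding altogether means the fixer never has to ask the first question; it only needs the list from the second.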
- Importing standard library modules in a class namespace is really, really hard to deal with. Exposing the name “http.server”, or a simple renaming thereof, to methods and to derived classes means the same thing as the last bullet: you’re ascribing to one name what could end up being the amalgamation of multiple names. I’m talking about this:
class A(object):
    import http.server
For all modules that fix_imports2 deals with, don’t do this. It will cause errors if and only if (“errors should never pass silently”) a derived class references “http.server” in A’s namespace, and this could be in another module. Plus, if I did implement a fix for this in fix_imports2, it would involve going to every class derived from class A and checking every member function, along with every piece of code that uses instances of A or classes derived from A, which sounds prohibitively complicated and error-prone. And again, this would not fix code in other modules that references A.http.server.
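Here is a small py3k example of how that name leaks out of the class. Class B and the method name are hypothetical, added only to show the lookup; in real code B could just as easily live in a different file from A.

```python
class A(object):
    # Binds the name "http" inside A's class namespace.
    import http.server


class B(A):
    def handler_class(self):
        # Found through inheritance: B has no "http" of its own, so this
        # resolves to A.http.server.
        return self.http.server.BaseHTTPRequestHandler


print(B().handler_class().__name__)
```

Nothing in B mentions an import, so a fixer looking only at import statements has no way of knowing that this attribute access depends on the one inside A.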
I’ve done a lot of thinking about fix_imports2 (on-and-off for almost a year, and for the past couple of weeks). I keep coming back to one single thought: using “from urllib.request import spam, ham, eggs, …” at the outermost indent level is the best way to ensure that everything will work properly after fix_imports2 is done with it (“simple is better than complex”). That case can be fixed independently from the code that makes use of spam, ham, and eggs by a simple pattern and transformation, the way all other fixers are done.
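As a sketch of why this case is simple: the imported names can be grouped by the py2k module that provides them, and the one import line becomes one line per module. This is an illustration only, working on plain strings rather than the syntax tree a real fixer sees; the table is a hypothetical, abbreviated slice of the real mapping and the function name is made up.

```python
from collections import OrderedDict

# Hypothetical, abbreviated table: urllib.request member -> py2k module.
URLLIB_REQUEST_MEMBERS = {
    "urlopen": "urllib2",
    "Request": "urllib2",
    "build_opener": "urllib2",
    "urlretrieve": "urllib",
    "urlcleanup": "urllib",
}


def split_from_import(names, mapping=URLLIB_REQUEST_MEMBERS):
    """Turn the names of one py3k from-import into py2k from-import lines."""
    grouped = OrderedDict()
    for name in names:
        # Group each name under the py2k module that provides it,
        # keeping first-seen order for modules and names alike.
        grouped.setdefault(mapping[name], []).append(name)
    return ["from %s import %s" % (module, ", ".join(members))
            for module, members in grouped.items()]


# "from urllib.request import urlopen, urlretrieve, Request" becomes:
for line in split_from_import(["urlopen", "urlretrieve", "Request"]):
    print(line)
```

The names themselves stay the same, so none of the code that uses them has to change.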
All other cases require extra thinking (both by me and by the fixer) in varying degrees. “import urllib.request” requires looking for code that uses “urllib.request.something” and giving feedback to guide transformation of “import urllib.request” into something else.* Conditional imports require following a dedent out of the suite but probably will not end up breaking anything, and “from urllib import request” and “import urllib.request as spam” are doable in a similar way to “import urllib.request”, if you don’t mind those different names going away. Importing modules into a class namespace is going to cause problems, and fix_imports2 will probably never handle this solely because of the complexity involved in getting to a solution that still will cause errors.
* Or, it could just replace “import urllib.parse” with “import urllib, urllib2, urlparse” separately from code that uses urllib.parse, and just replace each usage piecemeal, potentially leaving those modules in the namespace unused. Actually, that sounds like a really good idea. Maybe I’ll do that next.
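A toy version of that piecemeal replacement might look like the following. To be clear about the assumptions: it uses a regular expression over source text, which is a stand-in for the pattern matching a real fixer would do on the parse tree, and the abbreviated table and function name are hypothetical.

```python
import re

# Hypothetical, abbreviated table: urllib.parse member -> py2k module.
URLLIB_PARSE_MEMBERS = {
    "urlparse": "urlparse",
    "urljoin": "urlparse",
    "quote": "urllib",
    "urlencode": "urllib",
}

# Matches dotted usages such as "urllib.parse.quote".
_USAGE = re.compile(r"\burllib\.parse\.(\w+)")


def rewrite_usages(source, mapping=URLLIB_PARSE_MEMBERS):
    """Rewrite each urllib.parse.NAME usage to its py2k MODULE.NAME."""
    def replace(match):
        name = match.group(1)
        module = mapping.get(name)
        if module is None:
            # Unknown member: leave it alone rather than guess.
            return match.group(0)
        return "%s.%s" % (module, name)
    return _USAGE.sub(replace, source)


print(rewrite_usages("q = urllib.parse.quote(s)"))
print(rewrite_usages("u = urllib.parse.urljoin(base, path)"))
```

Each usage is rewritten on its own, without reference to how the import statement was handled, which is what would let the two transformations run independently.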
So the main thing to take away from this is that “from urllib.request import a, b, c, d” at the outermost indent level is the best way to write code that you want fix_imports2 to handle. And if I am to assume that reading this is a strong indicator that you write code that you want 3to2 to handle, then I imagine that this may actually be a relevant takeaway.