Pages

Monday, October 22, 2012

Python programming - Kaizen way

When programming with Python, I always feel that I have 3 stages.

Stage 1 - python as a script

In this stage, I normally don't know what I don't know. So, I just use python's method to get the functionality I need. Python is superb: it has almost every library I need, it has every container I need, and it means I can code pretty fast.

Stage 2 - python as a procedural language

When I pass the stage one, I can see the methods(functions) that I need to make as I see the duplications clearly. with def, I extract duplicate code into a method.

Stage 3 - python as an OOP language

In this stage, I can see what module/class I need to make. I extract methods into a python class, and I can easily separate the client and python object.

Things to consider

With this three stage python programming model, I need to answer those questions.
  1. What refactoring tools do I need to get the kaizen work effectively?
  2. When do I need to start making testing code?
  3. What method/technique do I need to be more effective? logging the procedure?

Three packages for file utilities in python

os.path

Python provides useful features for checking files and creating files/directories. The small issue is that the packages one need to use the features are not one, but three. For file checking, one can use os.path.exists() method. This can be used for file and directory existence checking.
# Directory check
os.path.exists(DIRECTORY)
# File check 
os.path.exists(FILEPATH)
os.path has more interesting utilities as you can find in this site. The return value of os.path.split is a tuple with 2 elements, and you can get the same result with os.path.dirname and os.path.basename. You should remember python uses basename with is a little bit hard to conjecture. Using os.path.split(), you can extract the directory name and file name.
# Directory
os.path.split(FILE_PATH)[0]
os.path.dirname(FILE_PATH)
# File
os.path.split(FILE_PATH)[1]
os.path.basename(FILE_PATH)
You have another kind of split, which is splitext. It gives you a tuple of root + ext. For example, you'll get ('hello','.txt') from 'hello.txt'.
# Directory
os.path.splitext(FILE_NAME)[0]
os.path.splitext(FILE_NAME)[1]
os.path.abspath() is also useful method that returns absolute path from relative path.
os.path.abspath(RELATIVE_PATH)

os

However for creating directory, one can use os.makedirs() method.
# Directory check
os.makedirs(DIRECTORY)

shutil

Finally, one can use shell utils (shutil) for copying files. You have two choices: copyfile() for specifying source file to dest file, and copy() method for source file to dest file or directory. You can refer to this site for shutil.
shutil.copyfile(FROM_FILE, TO_FILE)
shutil.copy(FROM_FILE, TO_FILE)
shutil.copy(FROM_FILE, TO_DIRECTORY)

Not so OO like features in Python

Python is built on Object Oriented paradigm. However, some of the python's features are not so Object Oriented to confuse OO users. For example, when converting string to intger, python doesn't provide something like "32".to_i(). Compared to Ruby that everything is an object, it looks like less OO way.
# Ruby
x = "102".to_i
# Python
x = int("102"))
Likewise, when you want to get the length, you should call the len() method explicitly.
# Ruby
x = "hello".length
# Python
x = len("hello") 
Sometimes it's annoying.

File save in java and python

Python

Python has really simple and intuitive interface for file saving.
f = open("/Users/smcho/Desktop/hello.txt", "w")
f.write("string")
f.close()
In case you realize that 'try/catch' is missing, you can add that easily.
f = open("hello.txt", "wb")
try:
    f.write("Hello Python!\n")
finally:
    f.close()
Or, you can use the wiz (with as).
with open("hello.txt", "wb") as f:
    f.write("Hello Python!\n")
You can get some hints on python's with as from this post.

Java

Java needs a chain of objects to write string to a file : File object -> FileWriter object (you don't need to specify "w") -> BufferedWriter. Check this post.
File file = new File (canonicalFilename);
BufferedWriter out = new BufferedWriter(new FileWriter(file)); 
out.write(text);
out.close();
This code is using Decorator design pattern, as BufferedWriter() wraps around the FileWriter object, and FileWriter does to the File object. One needs to import some modules, and let the method know it will throw some error. So, the full code should be as follows:
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

class Example {
 public static void main(String[] args) throws IOException {
  File file = new File ("/Users/smcho/Desktop/hello.txt");
  BufferedWriter out = new BufferedWriter(new FileWriter(file)); 
  out.write("Hello, world");
  out.close();
 }
}
You can also process the error processing on your own, as this post shows.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

class Example {
 public static void main(String[] args) {
  try {
   File file = new File ("/Users/smcho/Desktop/hello.txt");
   BufferedWriter out = new BufferedWriter(new FileWriter(file)); 
   out.write("Hello, world");
   out.close();
  }
  catch (IOException e) {
   e.printStackTrace();
  }
 }
}

Source code listing in blogger.com

Using Paste Bin

Pastebin.com seems to be unreliable sometimes. So, I think you'd better be using syntax highlighter if possible. You can use Paste Bin for pasting programming source code. After your pasting code, you'll have embed menu. When clicked, it will give you the javascript so that you can copy and paste. This is the source code to paste from pastebin.com.
<script src="http://pastebin.com/embed_js.php?i=ELanScfL"></script>
And this is the result.

Syntax Highligter

You can use syntax highligher also. The HTML code you need is as follows.







  Pattern p = Pattern.compile("rename_method\\(" + // ignore 'rename_method('
    "\"([^\"]*)\"," +                        // find '"....",'
    "\"([^\"]*)\"," +
    "\"([^\"]*)\""  +
    "\\)");                                  // ignore closing ')'
  Matcher m = p.matcher(refactoring);
  
  boolean result = m.find();
  if (result)
  {
   beforeMethodName = m.group(1);
   afterMethodName = m.group(2);
  }
You'll get this as a result.
  Pattern p = Pattern.compile("rename_method\\(" + // ignore 'rename_method('
    "\"([^\"]*)\"," +                        // find '"....",'
    "\"([^\"]*)\"," +
    "\"([^\"]*)\""  +
    "\\)");                                  // ignore closing ')'
  Matcher m = p.matcher(refactoring);
  
  boolean result = m.find();
  if (result)
  {
   beforeMethodName = m.group(1);
   afterMethodName = m.group(2);
  }
You can also refer to this post.

google-code-prettify

This post teaches about google-code-prettify, and I think it's also simple and useful.
... # Your code goes here

Easier way to embed java script/HTML in blogpost

Refer to this post, with it, you can just use pre tag. Your code now becomes this.

  Pattern p = Pattern.compile("rename_method\\(" + // ignore 'rename_method('
    "\"([^\"]*)\"," +                        // find '"....",'
    "\"([^\"]*)\"," +
    "\"([^\"]*)\""  +
    "\\)");                                  // ignore closing ')'
  Matcher m = p.matcher(refactoring);
  
  boolean result = m.find();
  if (result)
  {
   beforeMethodName = m.group(1);
   afterMethodName = m.group(2);
  }

String pattern matching in Java and Python

I needed to extract substring: the input is
rename_method("%.Arith#add()","%.Arith#addRefactored()","%.Arith")
And from the string I need to extract the substrings in the quotation marks:
%.Arith#add(), %.Arith#addRefactored(), %.Arith
The java code could be implemented as:
  Pattern p = Pattern.compile("rename_method\\(" + // ignore 'rename_method('
    "\"([^\"]*)\"," +                        // find '"....",'
    "\"([^\"]*)\"," +
    "\"([^\"]*)\""  +
    "\\)");                                  // ignore closing ')'
  Matcher m = p.matcher(refactoring);
  
  boolean result = m.find();
  if (result)
  {
   beforeMethodName = m.group(1);
   afterMethodName = m.group(2);
  }
Python uses some interesting features to make the code simpler and easier to read.
matchstring = (
"rename_method\("   # 
  "\"([^\"]*)\","  # %.Arith#add()
  "\"([^\"]*)\","  # %.Arith#addRefactored()
  "\"([^\"]*)\""   # %.Arith
  "\)"
)
inputstring = r'rename_method("%.Arith#add()","%.Arith#addRefactored()","%.Arith")';

m = re.search(matchstring, inputstring)
print m
if m:
 beforeMethodName = m.group(1)
 afterMethodName = m.group(2)
You can use python re module's compile/match, and by doing so, you can remove bunch of parenthesis
p = re.compile(r"""
rename_method\(   # 
  \"([^\"]*)\",   # %.Arith#add()
  \"([^\"]*)\",   # %.Arith#addRefactored()
  \"([^\"]*)\"    # %.Arith
  \)
"""
, re.VERBOSE)

m = p.match(inputstring)
if m:
 beforeMethodName = m.group(1)
 afterMethodName = m.group(2)
For concatenating strings in python with some comment, you can check this post. For re.VERBOSE, check this site. For regular expression in python, check this site, and this one.