Thursday, July 2, 2015

Using Java and Python to write a small tool

 I wrote a tool to parse the failure result files of some integration tests. I wrote it in both Java and Python, and the Python version turned out to be far easier to write and to read. Below are some comparisons.

 What I want to do:
  1. Read all txt files in a given folder.
  2. Check whether each txt file contains a specific string.
  3. Parse some information from the files that don't contain that string.
=========================================================
 Firstly, find all *.txt in a given folder.
  • In Java 7: 
        // Escape the dot so that only names ending in ".txt" match
        final Pattern p = Pattern.compile("(.*)\\.txt");
        File[] files = this.listFiles(new FileFilter() {
          @Override
          public boolean accept(File file) {
            return p.matcher(file.getName()).matches();
          }
        });
    

  • In Python 2: 
    files = [f for f in os.listdir(folder_path) if re.match(r'(.*)\.txt$', f)]
My comments: The Java 7 version needs a lot of verbose code just to build a filter on the file name. All I want is to type *.txt somewhere and get back all the txt files in the folder. Apart from the filter (*.txt) and the folder to look in, everything else is redundant.
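
As it happens, Python's standard glob module accepts the *.txt wildcard directly, which is close to the wish above. A minimal sketch, assuming a helper named list_txt_files (the name is mine and is not part of the original tool):

```python
import glob
import os


def list_txt_files(folder_path):
    """Return the names of all *.txt files directly inside folder_path."""
    # glob expands the shell-style wildcard. Like the shell, it skips
    # names that start with a dot (hidden files).
    pattern = os.path.join(folder_path, '*.txt')
    # basename drops the folder prefix so the result looks like the
    # os.listdir version; sorted makes the order predictable.
    return sorted(os.path.basename(p) for p in glob.glob(pattern))
```

Here the only things I have to spell out are the folder and the wildcard.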

=========================================================         
Next, check if a file contains the string "Failures: 0, Errors: 0, Skipped: 0"
My comments: To check whether a file contains a specific string, we can either check it line by line or read the whole file into a string and check whether that string contains the target. I really don't care which way I use, since the files are not very big. Python lets me say what I want to do in one line. Ideally I could type "if not "xxx" in File(path)" to make my code even more descriptive.
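
A minimal sketch of the check in Python, assuming the files are small enough to read in one go. The names file_contains and TextFile are hypothetical, invented here for illustration. TextFile shows one way the "xxx" in File(path) wish could be approximated with the __contains__ hook:

```python
def file_contains(path, target):
    """Return True if the text of the file at path contains target."""
    with open(path) as f:
        return target in f.read()


class TextFile(object):
    """Hypothetical wrapper so that `"xxx" in TextFile(path)` works."""

    def __init__(self, path):
        self.path = path

    def __contains__(self, target):
        # Python calls this method for the `in` and `not in` operators.
        return file_contains(self.path, target)
```

With this in place the check reads as: if "Failures: 0, Errors: 0, Skipped: 0" not in TextFile(path).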

=========================================================
 Last step: for each file that doesn't contain "Failures: 0, Errors: 0, Skipped: 0", extract the names of all failed tests. A line containing a failed test name looks like: "testSetStationOnOff(com.live365.api.station.integration.StationTestIT)  Time elapsed: 0.066 sec  <<< FAILURE!", so the format is "[test name]([test class name])  Time elapsed: [seconds] sec  <<< FAILURE!", and I just want to extract the [test name] part.

  • In Java 7:     
  // print failed test names
  // Compile the pattern once, outside the loop
  Pattern pattern = Pattern.compile("(.*)\\((.*)\\)  Time elapsed: (.*) sec  <<< FAILURE!");
  Scanner scanner = new Scanner(report);
  while (scanner.hasNextLine()) {
    String line = scanner.nextLine();

    Matcher matcher = pattern.matcher(line);
    if (matcher.matches()) {
      // group(1) is the [test name] part
      System.out.println("\t" + matcher.group(1));
    }
  }

  • In Python 2:
    # print failed test names
    lines = open(report).read()
    results = re.findall(r'(.*)\((.*)\)  Time elapsed: (.*) sec  <<< FAILURE!', lines)
    for result in results:
        print '\t'+result[0]
My comments: Yes, what I want is to find all the strings that match the format! Again, in Python my intent is obvious in two lines, but in Java I have to spell out many details I really don't care about, for example Pattern.compile(...). Why can't I just type line.matches("xxx") and let Java handle the rest for me?
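
Putting the three steps together, the whole tool might look roughly like the sketch below. It is not the exact code I ran: the names failed_tests, PASS_MARKER and FAILURE_LINE are made up for illustration, and it assumes every report is small enough to read in one go.

```python
import os
import re

PASS_MARKER = 'Failures: 0, Errors: 0, Skipped: 0'
# "[test name]([test class name])  Time elapsed: [seconds] sec  <<< FAILURE!"
FAILURE_LINE = re.compile(r'(.*)\((.*)\)  Time elapsed: (.*) sec  <<< FAILURE!')


def failed_tests(folder_path):
    """Map each failing report file name to the list of its failed test names."""
    failures = {}
    for name in sorted(os.listdir(folder_path)):
        # Step 1: only look at txt files.
        if not name.endswith('.txt'):
            continue
        with open(os.path.join(folder_path, name)) as f:
            text = f.read()
        # Step 2: skip reports that contain the "all passed" string.
        if PASS_MARKER in text:
            continue
        # Step 3: findall returns one (test name, class name, seconds)
        # tuple per matching line; keep only the test name.
        failures[name] = [m[0] for m in FAILURE_LINE.findall(text)]
    return failures
```

Calling failed_tests(folder_path) returns a dictionary that can then be printed in whatever layout is convenient.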