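The dircmp class in Python's filecmp module makes it easy to compare two directories. Its basic usage is already covered in plenty of articles, so it is not repeated here.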
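Run help(filecmp.dircmp) to see the built-in documentation. Of the methods it mentions, x.report() covers only the two directories themselves, and x.report_partial_closure() adds just their immediate common subdirectories. x.report_full_closure() does recurse through all common subdirectories, but it prints far too much: in most cases we only care about where the two directories differ.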
Excerpt from help(filecmp.dircmp):

 |  High level usage:
 |    x = dircmp(dir1, dir2)
 |    x.report() -> prints a report on the differences between dir1 and dir2
 |     or
 |    x.report_partial_closure() -> prints report on differences between dir1
 |          and dir2, and reports on common immediate subdirectories.
 |    x.report_full_closure() -> like report_partial_closure,
 |          but fully recursive.
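The difference between the report methods is easy to check on a throwaway pair of trees. The following is a minimal sketch, not part of the original script: the helper names (make_trees, captured, compare_reports) and the file names are made up for illustration, and the report output is captured as a string so it can be inspected.

```python
import contextlib
import io
import os
import tempfile
from filecmp import dircmp


def make_trees(root):
    """Create root/left and root/right; only left/sub contains a file."""
    left = os.path.join(root, "left")
    right = os.path.join(root, "right")
    os.makedirs(os.path.join(left, "sub"))
    os.makedirs(os.path.join(right, "sub"))
    with open(os.path.join(left, "sub", "only_left.txt"), "w") as f:
        f.write("x")
    return left, right


def captured(method):
    """Call a dircmp report method and return what it printed."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        method()
    return buf.getvalue()


def compare_reports():
    """Return the text printed by report() and by report_full_closure()."""
    with tempfile.TemporaryDirectory() as root:
        left, right = make_trees(root)
        top_level = captured(dircmp(left, right).report)
        recursive = captured(dircmp(left, right).report_full_closure)
    return top_level, recursive


if __name__ == "__main__":
    top_level, recursive = compare_reports()
    print(top_level)   # names the common subdirectory 'sub', not its contents
    print(recursive)   # also descends into 'sub' and lists only_left.txt
```

In this setup the only difference sits one level down, so report() mentions the common subdirectory but never the file inside it, while report_full_closure() does.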
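The script written for this article focuses on two goals:
1) Recursively compare two directories and all of their subdirectories.
2) Print only the differences: files that exist on both sides under the same name (common_files) but with different content (diff_files), plus files or subdirectories that exist only in the left or only in the right directory.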
The script compare_dir.py is as follows:
# -*- coding: utf-8 -*-
"""
@desc Recursively compare two directories with filecmp.dircmp,
      printing the differences and summary statistics.
@author longfeiwlf
@date 2020-5-20
"""
from filecmp import dircmp
import sys

# Global counters:
number_different_files = 0  # files with the same name but different content
number_left_only = 0  # files or directories found only in the left directory
number_right_only = 0  # files or directories found only in the right directory


def print_diff(dcmp):
    """Recursively compare two directories, print each difference and update the counters."""
    global number_different_files
    global number_left_only
    global number_right_only
    for name in dcmp.diff_files:
        print("diff_file found: %s/%s" % (dcmp.left, name))
        number_different_files += 1
    for name_left in dcmp.left_only:
        print("left_only found: %s/%s" % (dcmp.left, name_left))
        number_left_only += 1
    for name_right in dcmp.right_only:
        print("right_only found: %s/%s" % (dcmp.right, name_right))
        number_right_only += 1
    for sub_dcmp in dcmp.subdirs.values():
        print_diff(sub_dcmp)  # recurse into each common subdirectory


if __name__ == '__main__':
    try:
        mydcmp = dircmp(sys.argv[1], sys.argv[2])
    except IndexError as ie:
        print(ie)
        print("Usage: python compare_dir.py dir1 dir2")
    else:
        print("\nComparison details: ")
        print_diff(mydcmp)
        if (number_different_files == 0 and number_left_only == 0
                and number_right_only == 0):
            print("\nThe two directories are identical!")
        else:
            print("\nComparison summary:")
            print("Total Number of different files is: "
                  + str(number_different_files))
            print("Total Number of files or directories only in '"
                  + sys.argv[1] + "' is: " + str(number_left_only))
            print("Total Number of files or directories only in '"
                  + sys.argv[2] + "' is: " + str(number_right_only))
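One caveat is worth keeping in mind. By default dircmp fills diff_files using filecmp's shallow comparison, which treats two regular files as equal when their os.stat() signatures (file type, size, modification time) match, without reading the contents. A file that changed but kept the same size and timestamp would therefore not be reported. The following is a minimal sketch of a stricter check, assuming a byte-by-byte comparison is acceptable for the data involved. strict_diff_files and demo are hypothetical names, not part of the script, and the forced timestamp exists only to trigger the case.

```python
import os
import tempfile
from filecmp import cmpfiles, dircmp


def strict_diff_files(dcmp):
    """Return the common files whose bytes differ, skipping the stat shortcut."""
    _match, mismatch, _errors = cmpfiles(
        dcmp.left, dcmp.right, dcmp.common_files, shallow=False
    )
    return mismatch


def demo():
    """Build two files with equal size and mtime but different content."""
    with tempfile.TemporaryDirectory() as root:
        left = os.path.join(root, "left")
        right = os.path.join(root, "right")
        os.makedirs(left)
        os.makedirs(right)
        for folder, text in ((left, "aaaa"), (right, "bbbb")):
            path = os.path.join(folder, "data.txt")
            with open(path, "w") as f:
                f.write(text)
            # Assumption for the demo: force identical timestamps so both
            # files end up with the same (type, size, mtime) signature.
            os.utime(path, (1_000_000_000, 1_000_000_000))
        dcmp = dircmp(left, right)
        return list(dcmp.diff_files), strict_diff_files(dcmp)


if __name__ == "__main__":
    default_result, strict_result = demo()
    print("dircmp.diff_files:", default_result)
    print("strict_diff_files:", strict_result)
```

The strict variant reads every common file, so it is slower on large trees. Whether that trade-off is worthwhile depends on how likely same-size, same-timestamp edits are in the directories being compared.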
Example usage of compare_dir.py:
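One way to try the comparison without preparing directories by hand is to build a small pair of trees in a temporary location. The sketch below is an illustration under stated assumptions rather than the author's own example. The directory and file names are invented, and collect_diffs is a hypothetical helper that follows the same recursion as print_diff but returns its findings as a list instead of updating global counters.

```python
import os
import tempfile
from filecmp import dircmp


def write(path, text):
    """Write text to path, creating parent directories as needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(text)


def build_sample(root):
    """Create dir1 and dir2 under root with one difference of each kind."""
    left = os.path.join(root, "dir1")
    right = os.path.join(root, "dir2")
    write(os.path.join(left, "same.txt"), "same")
    write(os.path.join(right, "same.txt"), "same")
    write(os.path.join(left, "changed.txt"), "old")
    write(os.path.join(right, "changed.txt"), "newer text")
    write(os.path.join(left, "sub", "only_left.txt"), "x")
    write(os.path.join(right, "sub", "only_right.txt"), "y")
    return left, right


def collect_diffs(dcmp):
    """Return (kind, path) pairs for every difference, recursing like print_diff."""
    found = [("diff_file", os.path.join(dcmp.left, n)) for n in dcmp.diff_files]
    found += [("left_only", os.path.join(dcmp.left, n)) for n in dcmp.left_only]
    found += [("right_only", os.path.join(dcmp.right, n)) for n in dcmp.right_only]
    for sub_dcmp in dcmp.subdirs.values():
        found += collect_diffs(sub_dcmp)
    return found


def sample_report():
    """Compare the sample trees and return paths relative to the temp root."""
    with tempfile.TemporaryDirectory() as root:
        left, right = build_sample(root)
        return [(kind, os.path.relpath(path, root))
                for kind, path in collect_diffs(dircmp(left, right))]


if __name__ == "__main__":
    for kind, path in sample_report():
        print("%s found: %s" % (kind, path))
```

With this sample, the file same.txt is not reported, changed.txt shows up as a differing file, and the two files under sub appear as left-only and right-only entries found one level down.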
Summary