Is there an HDFS command to check whether two directories in HDFS have a common parent directory?
For example:
$ hadoop fs -ls -R /user/username/data
/user/username/data/LIST_1539724717/SUBLIST_1533057294, /user/…
for value in `hadoop fs -ls ${DIR} | awk '{print $NF}' | tr '\n' ' '`
do
    if [ "$value" != "items" ]; then
        # Add the value into the "results" array
        log "info" "$value"
        results+=("$value")
    fi
done

# Loop through each value inside the array, i.e. the entries under "$DIR"
for i in "${results[@]}"
do
    oldVal=`hadoop fs -ls -R ${i} | sed 's/  */ /g' | cut -d' ' -f 1,8 --output-delimiter=',' | grep ^d | cut -d, -f2`
    log "info" "Checking sub-directories under $i !"
    # This takes the directory name as its input and extracts only the
    # directories for the provided runID
    for val in `hadoop fs -ls -R $i | grep 1539724717 | sed 's/  */ /g' | cut -d' ' -f 1,8 --output-delimiter=',' | grep ^d | cut -d, -f2`
    do
        if [[ ! ${val} =~ ${oldVal} ]]; then
            oldVal=$val
            directory+=("${oldVal}")
        fi
    done
done
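The snippet above uses ${DIR}, a log helper, and the results/directory arrays without defining them. A minimal scaffolding that could precede it might look like the following; the variable value and the logger implementation are assumptions, not part of the original script:

DIR=/user/username/data    # top-level HDFS directory to scan (assumed value)
results=()                 # first-level entries found under ${DIR}
directory=()               # sub-directories collected for the runID

log() {
    # Simple timestamped logger used by the loop above: log "info" "message"
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$1] $2"
}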
The directory array contains all of the required directories.
By creating a shell script in which the directory names are passed in as variables, we can check whether they belong to the same parent.
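As a rough sketch of that idea (the script name and argument handling are assumptions, not from the original post), the two directory names could be passed as positional parameters and their parents compared with dirname:

#!/usr/bin/env bash
# Hypothetical helper: check whether two HDFS directories share the same parent.
# Usage: ./same_parent.sh /user/username/data/DIR_A /user/username/data/DIR_B

dir1="$1"
dir2="$2"

# Make sure both paths actually exist in HDFS and are directories.
for d in "$dir1" "$dir2"; do
    if ! hadoop fs -test -d "$d"; then
        echo "ERROR: $d does not exist or is not a directory in HDFS" >&2
        exit 1
    fi
done

# The parent can be derived from the path string itself; no extra HDFS call is needed.
parent1=$(dirname "$dir1")
parent2=$(dirname "$dir2")

if [ "$parent1" = "$parent2" ]; then
    echo "Same parent: $parent1"
else
    echo "Different parents: $parent1 vs $parent2"
fi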