项目作者: sonicfurqan

项目描述 :
Merging and updating csv script
高级语言: Python
项目地址: git://github.com/sonicfurqan/Process-Parse-Merge-CSV.git
创建时间: 2020-02-07T11:59:11Z
项目社区:https://github.com/sonicfurqan/Process-Parse-Merge-CSV

开源协议:

下载


Process - Parse - Merge CSV

  1. Merge CSV
  2. Process CSV for VLOOKUP

Modules Required

  1. Python
  2. Pandas
  3. Numpy
  4. PySimpleGUI
  5. PsUtill

Run Setup to install required dependency

  1. python setup.py

Parameters in paramters.py file

Common

  1. VALUES_TO_BE_REPLACED_BY_NULL
  2. CHUNK
  3. PARENT_FILE_LOCATION
  4. CHILD_FILE_LOCATION
  5. PARENT_PRIMARY_KEY
  6. CHILD_FOREIGN_KEY

    Merge

  7. SOURCE_COLUMN_NAME
  8. PARENT_SOURCE_VALUE
  9. CHILD_SOURCE_VALUE

    VLookup

  10. CHILD_COLUMN_TO_FETCH_DATA
  11. PARENT_COLUMN_TO_PLACE_DATA

Operation in run.py file

Main

  1. MERGE_DATA
  2. REPLACE_DIFFERENCE
  3. VLOOKUP

    Sub

  4. MERGE_HEADERS
  5. CLEAN_EMPTY_COLUMNS
  6. ADD_SOURCE_COLUMN
  7. OUTER_JOIN
  8. OVERRIDE_PARENTDATA

Run Merge Script

For running a script to merge 2 CSV

  1. python run.py

Example Merge

Master Example.csv

Id Name Email
2 Bb
3 C@b.com
4 Db D@b.com
6 Gb
8 F@b.com

Child Example.csv

EXT_Id Name Email
1 Ab a@b.com
2 b@b.com
3 Cb
4 asdf as@df.com
5

CSV after Outer Join

Id Name Email EXT_Id SOURCE
6 Gb Master
2 Bb b@b.com 2 Master/Child
3 Cb C@b.com 3 Master/Child
4 Db D@b.com 4 Master/Child
8 F@b.com Master
Ab a@b.com 1 Child
a@b.com 5 Child

CSV after inner Join with override False

Id Name Email EXT_Id SOURCE
2 Bb b@b.com 2 Master/Child
3 Cb C@b.com 3 Master/Child
4 Db D@b.com 4 Master/Child

CSV after inner Join with override True

Id Name Email EXT_Id SOURCE
2 Bb b@b.com 2 Master/Child
3 Cb C@b.com 3 Master/Child
4 asdf as@df.com 4 Master/Child

Example VLOOKUP

Target Example_new.csv

Id Name
1 a
2 b
3 c
4 d
5 e
f
g
i
j

Lookup Example_old.csv

Old_Id New_id Name
1 100 a
1 1000 aa
2 200 b
3 300 c
4 400 d
5 500 e
6 600 FF
7 700 GG
8 800 HH

Lookup Processed Example_new.csv

Id Name
f
g
i
j
100 a
200 b
300 c
400 d
500 e