Explaining Terms

Tammy Yang edited this page Feb 19, 2020 · 1 revision

depth(peeling)

In JSON, each nested dict adds one level of depth (peeling). Depth is counted from 0.

For example, in

dummy_dict = {"a": "b", "c": {"aa": "bb", "cc": "d"}}

"a" is at depth 0 and "aa" is at depth 1.

This follows from ordinary key access: to get "b" we write dummy_dict["a"] (one key), and to get "bb" we write dummy_dict["c"]["aa"] (two keys). Since depth is counted from 0, specifying 1 or 2 keys corresponds to depth 0 or 1.
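
The depth rule can be illustrated with a small recursive walk. This is only a sketch of the definition, not code from this repo; the helper name key_depths is hypothetical.

```python
def key_depths(d, depth=0):
    """Yield (key, depth) pairs for every key in a nested dict."""
    for key, value in d.items():
        yield key, depth
        if isinstance(value, dict):
            # Each nested dict adds one level of depth (peeling).
            yield from key_depths(value, depth + 1)


dummy_dict = {"a": "b", "c": {"aa": "bb", "cc": "d"}}
depths = dict(key_depths(dummy_dict))
print(depths)  # {'a': 0, 'c': 0, 'aa': 1, 'cc': 1}
```

Note that "c" itself is at depth 0; only the keys inside its dict move to depth 1.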

top_df, sub_df

A sub_df is any df whose table name contains the table name of a given df. For example, "temp__attachments__data", "temp__attachments__data__media" and so on are sub_dfs of "temp__attachments".
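
As a rough illustration of the naming rule, the sketch below picks out sub_dfs from a list of table names. It assumes that "containing" means "starts with the parent name followed by a double underscore", as in the example names above; the function find_sub_dfs and the extra table names are hypothetical and not part of this repo.

```python
def find_sub_dfs(table_names, top_name, sep="__"):
    """Return the table names that extend top_name by at least one level."""
    # Assumption: a sub_df name is the parent name plus a separator and a suffix.
    prefix = top_name + sep
    return [name for name in table_names if name.startswith(prefix)]


names = [
    "temp",
    "temp__attachments",
    "temp__attachments__data",
    "temp__attachments__data__media",
    "temp__comments",  # made-up name, used only to show a non-match
]
subs = find_sub_dfs(names, "temp__attachments")
print(subs)  # ['temp__attachments__data', 'temp__attachments__data__media']
```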

A sub_df can be viewed as a single column of its parent df that records multiple values.

Taking "dummy_dict" as an example, this repo turns it into

temp                                                                         
id_0| a |                                                                    
----+---+                                                                    
  0 | b |                                                                    
----+---+                                                                    
                                                                             
temp_c(table name)                                                           
id_0|id_c_1| aa | cc |                                                       
----+------+----+----+                                                       
  0 |  0   | bb | d  |                                                       
----+------+----+----+                                                       

but it can be viewed as

temp                                                                         
id_0| a | c                      |                                           
----+---+------------------------+                                           
  0 | b |{"aa": "bb", "cc": "d"} |                                           
----+---+------------------------+                                           

The "temp_c" table is like something growing out of "temp", so I call it a sub_df.
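
To make the split concrete, here is a minimal sketch that peels one level off a single record and produces the two tables shown above as lists of row dicts. It is not this repo's implementation: the function split_one_level is hypothetical, it handles only one record and one level of nesting, and the column names id_0 and id_c_1 are copied from the tables above.

```python
def split_one_level(record, name="temp", row_id=0):
    """Split a dict into a top table and one sub table per nested dict."""
    top_row = {"id_0": row_id}
    tables = {name: [top_row]}
    for key, value in record.items():
        if isinstance(value, dict):
            # The nested dict becomes its own table, linked back by id_0.
            sub_row = {"id_0": row_id, f"id_{key}_1": 0, **value}
            tables[f"{name}_{key}"] = [sub_row]
        else:
            top_row[key] = value
    return tables


dummy_dict = {"a": "b", "c": {"aa": "bb", "cc": "d"}}
tables = split_one_level(dummy_dict)
print(tables["temp"])    # [{'id_0': 0, 'a': 'b'}]
print(tables["temp_c"])  # [{'id_0': 0, 'id_c_1': 0, 'aa': 'bb', 'cc': 'd'}]
```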

Top_df refers to the base df that a sub_df is merged into.
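
The merge direction can be sketched in plain Python, with each table represented as a list of row dicts. This is an assumption-laden illustration, not the repo's merge code: merge_sub_into_top is a hypothetical helper, and it assumes rows are linked through the shared id_0 column.

```python
def merge_sub_into_top(top_rows, sub_rows, column, id_cols=("id_0", "id_c_1")):
    """Fold sub_df rows back into the top_df as a single column."""
    merged = []
    for top in top_rows:
        # Collect the sub rows that belong to this top row, without id columns.
        matches = [
            {k: v for k, v in sub.items() if k not in id_cols}
            for sub in sub_rows
            if sub["id_0"] == top["id_0"]
        ]
        row = dict(top)
        # One match is stored as a dict; several matches are kept as a list.
        row[column] = matches[0] if len(matches) == 1 else matches
        merged.append(row)
    return merged


top_df = [{"id_0": 0, "a": "b"}]
sub_df = [{"id_0": 0, "id_c_1": 0, "aa": "bb", "cc": "d"}]
merged = merge_sub_into_top(top_df, sub_df, column="c")
print(merged)  # [{'id_0': 0, 'a': 'b', 'c': {'aa': 'bb', 'cc': 'd'}}]
```

Here "temp" plays the role of the top_df and "temp_c" the sub_df, which reproduces the single-table view shown earlier.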