Parka Structures

This section describes the structures that make up a Parka Analysis. As mentioned earlier, a Parka Analysis contains a lot of information; this section explains each kind of structure it holds.

Everything starts with a Parka Analysis, defined as follows:

case class ParkaAnalysis(datasetInfo: Both[DatasetInfo], result: ParkaResult)

The left part, datasetInfo, isn’t implemented yet.

The right part, result, contains two types of analysis: an inner analysis and an outer analysis.

The difference between them is simple:

If the left and the right Datasets both have a row with the same key(s), that row is part of the inner analysis.
If a row with a given key exists only in the left Dataset or only in the right Dataset, that row is part of the outer analysis.
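
The inner/outer rule can be illustrated with plain Scala collections. This is only a minimal sketch, not Parka's implementation: InnerOuterSketch, Split and split are hypothetical names, and each Dataset is modelled as a Map from key to row value.

```scala
// Hypothetical helper, not part of Parka: splits two keyed datasets
// into inner rows (key on both sides) and outer rows (key on one side).
object InnerOuterSketch {
  final case class Split[K, V](
    inner: Map[K, (V, V)], // keys present on the left and on the right
    leftOnly: Map[K, V],   // outer: keys present only on the left
    rightOnly: Map[K, V]   // outer: keys present only on the right
  )

  def split[K, V](left: Map[K, V], right: Map[K, V]): Split[K, V] = {
    val sharedKeys = left.keySet.intersect(right.keySet)
    Split(
      inner     = sharedKeys.map(k => (k, (left(k), right(k)))).toMap,
      leftOnly  = left.filter { case (k, _) => !sharedKeys.contains(k) },
      rightOnly = right.filter { case (k, _) => !sharedKeys.contains(k) }
    )
  }
}
```

For example, with left keys "a" and "b" and right keys "b" and "c", only "b" lands in the inner part, while "a" and "c" land in the outer part.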

Global data structure

Here is the global structure of a Parka Analysis:

datasetInfo: Both[DatasetInfo]
- Not Implemented
result: ParkaResult
- inner: Inner
  - countRowEqual: Long
  - countRowNotEqual: Long
  - countDeltaByRow: Map[Set[String], DeltaByRow]
  - equalRows: DescribeByRow
  - byColumn: Map[String, Delta]
- outer: Outer
  - both: Both[DescribeByRow]
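
As a reading aid, the tree above can be written out as Scala case classes. This is a simplified sketch, not the library source: the authoritative definitions are in Parka's Structure.scala. It assumes that Both pairs a left and a right value, and it uses empty placeholders for DatasetInfo, DeltaByRow, DescribeByRow and Delta, whose fields are not listed in this section. The equalRatio helper is hypothetical and assumes that countRowEqual plus countRowNotEqual covers every inner row.

```scala
// Simplified mirror of the structure tree; see Structure.scala for the real types.
object StructureSketch {
  // Assumption: Both pairs a left-hand and a right-hand value of the same type.
  final case class Both[T](left: T, right: T)

  // Placeholders: the real types carry fields that this section does not list.
  final case class DatasetInfo()
  final case class DeltaByRow()
  final case class DescribeByRow()
  final case class Delta()

  final case class Inner(
    countRowEqual: Long,
    countRowNotEqual: Long,
    countDeltaByRow: Map[Set[String], DeltaByRow],
    equalRows: DescribeByRow,
    byColumn: Map[String, Delta]
  )
  final case class Outer(both: Both[DescribeByRow])
  final case class ParkaResult(inner: Inner, outer: Outer)
  final case class ParkaAnalysis(datasetInfo: Both[DatasetInfo], result: ParkaResult)

  // Hypothetical helper: share of inner rows that are equal on both sides,
  // or None when the inner analysis contains no rows.
  def equalRatio(analysis: ParkaAnalysis): Option[Double] = {
    val inner = analysis.result.inner
    val total = inner.countRowEqual + inner.countRowNotEqual
    if (total == 0L) None else Some(inner.countRowEqual.toDouble / total)
  }
}
```

Navigating from the top, the inner counters are reached through analysis.result.inner, and the rows found on one side only through analysis.result.outer.both.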

Related link(s)

https://github.com/univalence/spark-tools/blob/master/parka/src/main/scala/io/univalence/parka/Structure.scala
