Data Integrator (Python API)
cls.TableJoin.CTableJoin Class Reference

Class for joining two tables, neither of which needs to be sorted.


Public Member Functions

def __init__ (s)
 
def HaveHeader (s, flag)
 Indicate whether the files have header data.
 
def SetPermissiveness (s, p)
 Set the permissiveness model if invoked as a command line tool.
 
def SetOutputFile (s, outFFN)
 Set the output file name.
 
def SetJoinFile (s, ffn, jCol, tCol=[])
 Set and read the file to be joined to the base file.
 
def GetExitCode (s)
 Retrieve the exit code for command-line invocation according to the permissiveness model.
 
def Join (s, inFFN, jCol, noData=CELL_NO_DATA, idColEntry="", idColName="Source", joinUnpairable=False, summary=False, headerSuffix="")
 Join data to the base file.
 

Detailed Description

Class for joining two tables, neither of which needs to be sorted.

Joins two unsorted tables, much like the classic 'join' command.
Additionally, a source tag column can be added that identifies the file
the data was joined from. It is also possible to transfer only a subset
of the columns to the joined file. We deviate slightly from the 'join'
command in that the two files are not treated equally: there is a base
file, to which data from a join file is added. Because the join file
must reside in memory, we recommend choosing the smaller of the two
files as the join file.
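
The asymmetry between base file and join file can be illustrated with a small in-memory hash join. This is a minimal sketch and not the implementation of CTableJoin: the function `join_tables`, its parameters, the tab delimiter, and the 'N/A' padding value are all assumptions made for the example.

```python
import csv


def join_tables(base_lines, join_lines, base_key=0, join_key=0,
                keep_unpaired=False, no_data="N/A"):
    """Join two unsorted tab-separated tables on a key column.

    Hypothetical helper for illustration; not part of CTableJoin.
    """
    # Load the join table into a dict keyed by the join column. This is
    # the part that must fit in memory, so it should be the smaller table.
    lookup = {}
    width = 0
    for row in csv.reader(join_lines, delimiter="\t"):
        if len(row) <= join_key:
            continue  # skip lines that lack the key column
        rest = row[:join_key] + row[join_key + 1:]
        lookup.setdefault(row[join_key], []).append(rest)
        width = max(width, len(rest))

    # Stream the base table once; neither input has to be sorted.
    joined = []
    for row in csv.reader(base_lines, delimiter="\t"):
        if len(row) <= base_key:
            continue
        matches = lookup.get(row[base_key])
        if matches is None:
            if keep_unpaired:
                # Pad unpaired base lines with the no-data marker.
                joined.append(row + [no_data] * width)
            continue
        for rest in matches:
            joined.append(row + rest)
    return joined
```

Only the join table is held in a dictionary; the base table is read row by row, which is why the smaller file is the better choice for the join side.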

Constructor & Destructor Documentation

◆ __init__()

def cls.TableJoin.CTableJoin.__init__ (   s)

Member Function Documentation

◆ GetExitCode()

def cls.TableJoin.CTableJoin.GetExitCode (   s)

Retrieve the exit code for command-line invocation according to the permissiveness model.

Returns
int.

◆ HaveHeader()

def cls.TableJoin.CTableJoin.HaveHeader (   s,
  flag 
)

Indicate whether the files have header data.

    @param flag  @c True if there are headers, else @c False

◆ Join()

def cls.TableJoin.CTableJoin.Join (   s,
  inFFN,
  jCol,
  noData = CELL_NO_DATA,
  idColEntry = "",
  idColName = "Source",
  joinUnpairable = False,
  summary = False,
  headerSuffix = "" 
)

Join data to the base file.

    The file to be joined must have been set and read beforehand. In
    this call, we use its data to join with the base file given by
    parameter @c inFFN. It is possible to add a column which identifies
    the join data set. Unpaired lines from the base file can be printed,
    too.
    @param inFFN  Full path file name for base file, '-' for stdin.
    @param jCol  Column index with join key (counted from 0).
    @param noData  Empty cell specifier. Defaults to that of the system.
    @param idColEntry  A @c String which specifies the origin of the join
     file. Helpful when joining vertically, that is, when multiple files
     with similar data are joined in several independent steps and the
     final output file is the concatenation of all of them.
    @param idColName  Column header name for the optional data origin
     column.
    @param joinUnpairable  If @c True, print unpaired lines from the base
     file, too.
    @param summary  Do not pair but print the number of lines that would
     result upon pairing. Helpful when joining large files into even
     larger files with a lot of overlaps.
    @param headerSuffix  Add this string to each header column name (for
     easier distinguishing in multiple joins).
    @return @c True if all lines were output and joining was successful.
     @c False if an error occurred, i.e., the input file could not be
     opened or was empty. The exit code has been set accordingly.

◆ SetJoinFile()

def cls.TableJoin.CTableJoin.SetJoinFile (   s,
  ffn,
  jCol,
  tCol = [] 
)

Set and read the file to be joined to the base file.

    The method is very forgiving when it comes to missing columns or
    incorrect indexing. It implements the permissiveness model and still
    tries to continue when encountering invalid input lines. Even a
    header line can be invalid and the program still tries to continue.
    In this case, however, the header will be erased and filled with
    'N/A' values.
    @param ffn  Full path file name of file to join.
    @param jCol  Column number used for joining the files. Numbering
     starts from 0.
    @param tCol  [optional] Add only these columns to the base
     file. Numbering starts with 0.
    @return @c True if file has successfully been read. @c False if the
     file could not be read.
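
The forgiving handling of column indices can be pictured as follows. This is only a sketch under assumptions: the helper `pick_columns` is hypothetical, and padding missing cells with 'N/A' is borrowed from the header behaviour described above rather than confirmed for data lines.

```python
def pick_columns(row, t_col=None, no_data="N/A"):
    """Return the cells of `row` selected by `t_col`, tolerating bad indices.

    Hypothetical helper, not part of CTableJoin. An empty or missing
    `t_col` keeps all columns.
    """
    if not t_col:
        return list(row)
    # Indices outside the row yield the no-data marker instead of raising.
    return [row[i] if 0 <= i < len(row) else no_data for i in t_col]
```

The sketch uses `None` rather than a mutable default such as `[]` for the column list, which avoids sharing one list object between calls.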

◆ SetOutputFile()

def cls.TableJoin.CTableJoin.SetOutputFile (   s,
  outFFN 
)

Set the output file name.

    @param outFFN  Full path file name for output file, '-' for stdout.

◆ SetPermissiveness()

def cls.TableJoin.CTableJoin.SetPermissiveness (   s,
  p 
)

Set the permissiveness model if invoked as a command line tool.

    @param p  Permissiveness model @c string. One of
    @c PERMISSIVENESS_ECHO, @c PERMISSIVENESS_SKIP, @c PERMISSIVENESS_STOP
    or @c None to disable application of the model.

The documentation for this class was generated from the following file: