Hive Tutorial 32 : Appending new data to existing data

This can be achieved in a couple of ways; which one fits depends purely on your requirements.
  1. If overwriting the existing records in the partition is not a concern (that is, you don't have a large history, say 10 years of data), then Insert Overwrite might fit. It replaces whatever is already in the target table or partition.
INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;
  2. If duplicates in the partition are not a concern, then Insert Into might fit. It appends to the existing data, so rerunning the same load inserts the same rows again (honestly, I wouldn't want duplicate records in a table).
INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement;
  3. If you have history data plus incremental data, the history data can be inserted once, and the incremental data (at whatever frequency you choose: daily, weekly, or fortnightly) can then be inserted using Insert Overwrite.
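
As a rough sketch of the third approach, the statements below load history once and then overwrite one partition per incremental run. The table names (sales, sales_history, sales_staging), the columns, the partition column load_date, and the date value are all hypothetical and only serve the illustration; the sketch also assumes each incremental batch maps to its own partition, so that Insert Overwrite replaces only that batch and leaves the history untouched.

```sql
-- Hypothetical target table, partitioned by load date (names are assumptions).
CREATE TABLE IF NOT EXISTS sales (
  order_id BIGINT,
  amount   DOUBLE
)
PARTITIONED BY (load_date STRING);

-- One-time history load. Dynamic partitioning lets a single statement
-- write every load_date found in the (hypothetical) sales_history table.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE sales PARTITION (load_date)
SELECT order_id, amount, load_date
FROM sales_history;

-- Incremental load for one day. Only the named partition is replaced,
-- so rerunning this statement for the same day does not create duplicates.
INSERT OVERWRITE TABLE sales PARTITION (load_date = '2024-01-15')
SELECT order_id, amount
FROM sales_staging
WHERE load_date = '2024-01-15';
```

Swapping INSERT OVERWRITE for INSERT INTO in the last statement would append the rows instead, which is where the duplicates mentioned in the second approach come from if the same day is loaded twice.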
