Purging/Archiving Partial Data from Very Huge Table
Real challange for DBA’s when they archive table , is when they perform partial data archiving from a very huge table.
This is my case : I was pushed into a bull ring, where i need to purge a workflow table with 42 GB size , which has 512203845 rows
Customer asked me to retain only 6 months .
Initially, as a regular DBAs, i tried to purge through delete statement with nologging and parallel, OMG~! it took the complete day and additionally it ate all my archivelog space.
then i tried using expdp and impdp using the query option after switching off the archivelog mode, where it goes like this.
expdp myuser/mypass tables=’WORKFLOW_TABLE’ query='”where created_time < sysdate-180″‘
It started well without any issues, but it took more than 13 hours.
Customer will kill me if i am asking for 13 hours downtime.
Finally i worked the very good solution, which took me only 01 hours 30 mins to complete retaining 343888371 records gaining 26GB.
Following is the technical solution:
1. Extract the metadata of the table which you are going to purge,
2. Extract the metadata of the index of the table which you are going to purge.
3. Extract the dependent triggers,grant , synonyms
we can use
select dbms_metadata.get_ddl(‘TABLE’,’WORKFLOW_TABLE’,’MYUSER’) FROM DUAL;
select dbms_metadata.get_ddl(‘INDEX’,’WORKFLOW_TABLE_N1′,’MYUSER’) FROM DUAL;
select dbms_metadata.get_ddl(‘TRIGGER’,’WORKFLOW_TABLE_TRG1′,’MYUSER’) FROM DUAL;
4. create custom tablespace to accommodate the workflow partial data. [ in my case 30G ]
5. create the table with [as select * from … where 1=2 ] clause , so that it will take the data structure including the constraints associated.