Purging/Archiving Partial Data from Very Huge Table



Real challange for DBA’s when they archive table , is when they perform partial data archiving from a very huge table.

This is my case : I was pushed into a bull ring, where i need to purge a workflow table with 42 GB size , which has 512203845 rows

Customer asked me to retain only 6 months .

Initially, as a regular DBAs, i tried to purge through delete statement with nologging and parallel, OMG~!  it took the complete day and additionally it ate all my archivelog space.

then i tried using expdp and impdp using the query option after switching off the archivelog mode, where it goes like this.

expdp myuser/mypass tables=’WORKFLOW_TABLE’ query='”where created_time < sysdate-180″‘

It started well without any issues, but it took more than 13 hours.
Customer will kill me if i am asking for 13 hours downtime.

Finally i worked the very good solution, which took me only 01 hours 30 mins to complete retaining 343888371 records gaining 26GB.

Following is the technical solution:
1.  Extract the metadata of the table which you are going to purge,
2.  Extract the metadata of the index of the table which you are going to purge.
3.  Extract the dependent triggers,grant , synonyms

we can use

select dbms_metadata.get_ddl(‘TABLE’,’WORKFLOW_TABLE’,’MYUSER’) FROM DUAL;

select dbms_metadata.get_ddl(‘INDEX’,’WORKFLOW_TABLE_N1′,’MYUSER’) FROM DUAL;

 select dbms_metadata.get_ddl(‘TRIGGER’,’WORKFLOW_TABLE_TRG1′,’MYUSER’) FROM DUAL;


4. create custom tablespace to accommodate the workflow partial data. [ in my case 30G ]

5. create the table with [as select * from … where 1=2 ] clause , so that it will take the data structure including the constraints associated.

SQL> create table WORKFLOW_TABLE_R1″ as select * from “WORKFLOW_TABLE” where 1=2 ;

6. Move the backup table to the custom tablespace;

SQL> alter table “WORKFLOW_TABLE_R1” move tablespace ev_data_purge ;

Table altered.

7. For safety reason, enforce nologging for workflow table

SQL> alter table “WORKFLOW_TABLE_R1” NOLOGGING ;

Table altered.

8. insert partial data into backup table

SQL> insert into “WORKFLOW_TABLE_R1” select * from “WORKFLOW_TABLE” where created_time < sysdate-180 ;

343888371 rows created.

6. Drop the original table

 
drop table “WORKFLOW_TABLE”;

7. Rename the backup table to original table name

Alter table workflow_TABLE_r1 rename to workflow_TABLE;

8. create indexes for the workflow table.[with the metadata]

9. Create triggers ,grants and synonyms for the workflow table..[with the metadata]

10. Drop tablespace custom tablespace and enforce force logging for database and for the table.

 
 
How is it ??? Post your comments.
 
Yours,
Narasimha Rao.K





 

  • October 1, 2012 | 31 views