
Oracle Text Index -- Real time implementation


Creating a CONTEXT index on a table of around 100 million records, roughly 120GB in size, was taking more than 20 days to complete. On top of those 100 million records, around 50 million more are due to be added, which will grow the table to about 180GB, and with that data index creation was expected to take even longer.

The following steps were being used to create the index.

set timing on time on

exec Ctx_Ddl.Create_Preference('SCB', 'BASIC_WORDLIST');

exec ctx_ddl.set_attribute('SCB', 'wildcard_maxterms', 15000);

exec ctx_ddl.set_attribute('SCB', 'substring_index', 'TRUE') ;

execute CTXSYS.CTX_ADM.SET_PARAMETER ('LOG_DIRECTORY','/oat_lg/rrotmedb/archive/ctx/');


exec ctx_output.add_event(CTX_OUTPUT.EVENT_INDEX_PRINT_ROWID);

drop index NORKOM56.SCB_TRANS_IDX force;

CREATE INDEX NORKOM56.scb_trans_idx ON NORKOM56.scb_all_transactions



PARAMETERS('Sync (on commit) MEMORY 1073741823 wordlist SCBWCP');


We changed a few db parameters, added a few preferences, and removed one preference from the list above. The steps are as follows:

1. Add the following db parameters to pfile,


Bounce the database.

2. Create Preferences as below,

a. EXEC Ctx_Ddl.Create_Preference('USE', 'BASIC_WORDLIST');
EXEC ctx_ddl.set_attribute('USE', 'wildcard_maxterms',15000) ;

b. EXEC ctx_ddl.drop_preference('SCB_LEXER');
EXEC ctx_ddl.create_preference('SCB_LEXER','basic_lexer');
EXEC ctx_ddl.set_attribute('SCB_LEXER','printjoins','-.,&/');

c. EXEC ctx_ddl.create_preference('dimp_USE', 'BASIC_STORAGE');
EXEC ctx_ddl.set_attribute('dimp_USE', 'I_TABLE_CLAUSE','tablespace TME_SGHK_CM_DATA01 STORAGE (INITIAL 10M)');
EXEC ctx_ddl.set_attribute('dimp_USE', 'K_TABLE_CLAUSE','tablespace TME_SGHK_CM_DATA01 STORAGE (INITIAL 10M)');
EXEC ctx_ddl.set_attribute('dimp_USE', 'N_TABLE_CLAUSE','tablespace TME_SGHK_CM_DATA01 STORAGE (INITIAL 10M)');
EXEC ctx_ddl.set_attribute('dimp_USE', 'I_INDEX_CLAUSE','tablespace TME_SGHK_CM_DATA01 STORAGE (INITIAL 10M)');

3. Enable logging (optional) to monitor index creation. Leaving logging disabled gives a further performance improvement.

EXEC CTXSYS.CTX_ADM.SET_PARAMETER ('LOG_DIRECTORY','/oat_lg/rrotmedb/archive/mithun/');
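
Setting LOG_DIRECTORY only tells Oracle Text where log files go; logging itself is switched on per session. A minimal sketch of how that might look, assuming it runs in the same SQL*Plus session as the index creation (the file name scb_trans_idx.log is a hypothetical placeholder, not taken from the original run):

```sql
-- Start logging to a file under LOG_DIRECTORY.
-- 'scb_trans_idx.log' is a hypothetical file name.
exec ctx_output.start_log('scb_trans_idx.log');

-- Optional: print each rowid as it is indexed (makes the log much larger).
exec ctx_output.add_event(ctx_output.event_index_print_rowid);

-- ... run CREATE INDEX here, in this same session ...

-- Stop logging once index creation has finished.
exec ctx_output.end_log;
```

Skipping these calls entirely is the "logging disabled" case mentioned in step 3.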

4. Create index.
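
The exact statement used for this step is not shown here. A sketch of what it could look like with the preferences from step 2 is given below. The indexed column name txn_text is a hypothetical placeholder, and carrying over the MEMORY and SYNC settings from the earlier attempt is an assumption, as is using the predefined CTXSYS.EMPTY_STOPLIST for the empty stop list:

```sql
-- Sketch only: txn_text is a hypothetical column name.
-- Preference names are assumed to resolve in the schema running the statement.
CREATE INDEX NORKOM56.scb_trans_idx
  ON NORKOM56.scb_all_transactions (txn_text)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('wordlist USE
               lexer SCB_LEXER
               storage dimp_USE
               stoplist CTXSYS.EMPTY_STOPLIST
               memory 1073741823
               sync (on commit)');
```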


As already explained and demonstrated to your team, there were a lot of waits on redo log creation and dbwrite, so I have increased the related parameters on the db. You can continue to use these settings to get better performance, not only for index creation but also for your day-to-day processing. Please test these parameters thoroughly if you want to keep using them in production.

We have removed the substring_index preference setting from index creation, as it adds to the bottleneck of any DML operation on the table. We have also introduced an empty stop list: by default, index creation was using the Oracle stop list, which creates an unnecessary bottleneck during index creation. Also, as per your new requirement, I have added LEXER properties as well.
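
For reference, an empty stop list can be defined by creating a stoplist and adding no stopwords to it. The sketch below assumes a hypothetical name, SCB_EMPTY_STOP; the name actually used in this run is not recorded here.

```sql
-- Create a stoplist and deliberately add no stopwords to it.
-- SCB_EMPTY_STOP is a hypothetical name.
exec ctx_ddl.create_stoplist('SCB_EMPTY_STOP', 'BASIC_STOPLIST');

-- It would then be referenced at index creation time as:
--   PARAMETERS ('... stoplist SCB_EMPTY_STOP ...')
```

Oracle Text also ships a predefined CTXSYS.EMPTY_STOPLIST that can be named directly in the PARAMETERS clause, which avoids creating a separate object.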

With all the changes, index creation for 100 million rows now takes around 10 hours, and for 150 million records it takes around 18 hours. This has been tested on 2 test instances with hardware resources equivalent to production: 8 dual-core CPUs and 32 GB RAM.

I was able to reduce the time from >20 days to less than 18 hours with 150 million records.

Search performance is still an ongoing issue. I will be posting details about Oracle Text and other related performance issues shortly.


Anonymous said…
"Create a context index on a table which has around 100 million records adds upto a size of 120GB. Index creation takes more than 20 days to complete"

Are you kidding?

Anyway the information in your site
is useful. Keep up the good work.

Mithun Ashok said…
Hi Boris,

Truth is greater than fiction. It was indeed taking 20 days to complete, or rather, it never reached completion, and I am not kidding.

And thanks for your comments.
