Optimize full-text search in Amazon RDS for MySQL and Amazon Aurora MySQL

Developers and database administrators rely on full-text search (FTS) to efficiently search text documents based on relevance. However, maintaining optimal FTS performance in MySQL databases can be challenging. If not managed properly, FTS can become a significant performance bottleneck and may even lead to service outages.

In this post, we show you how to optimize FTS performance in Amazon Relational Database Service (Amazon RDS) for MySQL and Amazon Aurora MySQL-Compatible Edition through proper maintenance and monitoring. We discuss why FTS indexes require regular maintenance, common issues that can arise, and best practices for keeping your FTS-enabled databases running smoothly.

This post is based on Amazon RDS for MySQL 8.x and Aurora MySQL-Compatible version 3.x and later. Some features and parameters might vary depending on your specific version.

Note: All code examples, SQL statements, stored procedures, and configuration recommendations in this post are provided for demonstration and testing purposes only. Before deploying any of these examples to a production environment, review them with your security team, apply the principle of least privilege, validate all parameter values for your specific workload, and ensure compliance with your organization’s security policies. These examples have not been hardened for production use and may require additional controls depending on your regulatory and compliance requirements (for example, PCI DSS, HIPAA, or SOC 2).

Understanding FTS in MySQL

Full-text search is a special feature in MySQL that helps you find words in text quickly. It works with text-based columns (CHAR, VARCHAR, or TEXT) to enable both natural language and Boolean searching capabilities. FTS uses an inverted index design that stores a list of keywords and, for each word, a list of documents containing that word along with position information.

When you create a full-text index, MySQL generates several auxiliary tables to manage the index data. These include six partitioned index tables (FTS_<id>_INDEX_1 through FTS_<id>_INDEX_6) that form the inverted index, plus additional tables for deletion handling (FTS_<id>_DELETED, FTS_<id>_BEING_DELETED) and state management (FTS_<id>_CONFIG). For more details, see the InnoDB full-text index documentation. FTS indexes use a unique document identifier (DOC_ID) to map words to the documents where they appear. FTS provides efficient text searching using the MATCH() … AGAINST syntax, so users can perform complex queries across large text collections.
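As a minimal sketch of the concepts above, the following statements create a table with a full-text index, run a natural language and a Boolean search, and list the auxiliary tables that InnoDB created. The schema demo_fts, the table articles, the column names, and the index name ft_title_body are illustrative assumptions, and reading information_schema.INNODB_TABLES assumes your user has the PROCESS privilege.

```sql
-- Hypothetical demo schema and table (names are assumptions for illustration)
CREATE DATABASE IF NOT EXISTS demo_fts;

CREATE TABLE demo_fts.articles (
    id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(200),
    body  TEXT,
    FULLTEXT INDEX ft_title_body (title, body)
) ENGINE = InnoDB;

INSERT INTO demo_fts.articles (title, body) VALUES
    ('MySQL tutorial',     'An introduction to full-text search in a relational database'),
    ('Index maintenance',  'Why an inverted index needs regular optimization'),
    ('Backup strategies',  'Snapshots and point-in-time recovery for a database');

-- Natural language search, ranked by relevance
SELECT id, title,
       MATCH(title, body) AGAINST('database' IN NATURAL LANGUAGE MODE) AS score
FROM demo_fts.articles
WHERE MATCH(title, body) AGAINST('database' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;

-- Boolean search: must contain "index", must not contain "backup"
SELECT id, title
FROM demo_fts.articles
WHERE MATCH(title, body) AGAINST('+index -backup' IN BOOLEAN MODE);

-- List the auxiliary tables backing the full-text index
SELECT table_id, name
FROM information_schema.INNODB_TABLES
WHERE name LIKE 'demo_fts/fts_%';
```

The column list in MATCH() has to match the column list of the full-text index exactly, which is why both searches name title and body.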

Create FTS index on large table

By default, MySQL uses two threads (configurable via innodb_ft_sort_pll_degree, range 1-16) to create full-text search (FTS) indexes. These threads handle tokenizing text, sorting words, and inserting data into FTS index tables. To speed up FTS index creation on large tables, you can increase the number of parallel threads by setting innodb_ft_sort_pll_degree. This is a static parameter in RDS and Aurora, meaning changes require a full DB instance reboot to take effect (unlike dynamic parameters that apply immediately).

Note: This setting only affects FTS index creation and doesn’t impact regular search operations.

Increasing innodb_sort_buffer_size (default: 1 MB) can also help speed up FTS index creation on large tables. This variable controls the size of in-memory sort buffers used during index build operations. Larger sort buffers reduce the number of disk-based merge passes required, which can significantly improve performance when indexing large text datasets. A value of 64 MB (67108864 bytes) is a reasonable starting point for large table index creation.
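The following sketch shows one way to confirm the two build settings before creating an index on a large table. The table demo_fts.articles_archive, its body column, and the index name are hypothetical; the values themselves are changed in the DB parameter group, not in this session.

```sql
-- Read the current index-build settings (changed through the parameter group)
SELECT @@GLOBAL.innodb_ft_sort_pll_degree                AS fts_sort_threads,
       @@GLOBAL.innodb_sort_buffer_size / 1024 / 1024    AS sort_buffer_mb;

-- Build the full-text index; table, column, and index names are assumptions
ALTER TABLE demo_fts.articles_archive
    ADD FULLTEXT INDEX ft_archive_body (body);
```

Adding a full-text index doesn't permit concurrent DML on the table while the index is built, so it's worth running the ALTER TABLE statement in a low-traffic window and timing it in a test environment first.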

Why FTS indexes need regular maintenance

When working with MySQL’s InnoDB storage engine, tables using regular B-tree indexes typically require little ongoing maintenance. FTS indexes are different: their more complex architecture requires regular maintenance. FTS indexes use inverted lists and document structures to tokenize text into searchable words, and they employ a special cache mechanism to handle the high volume of small insertions that occur during document indexing. Additionally, MySQL doesn’t immediately remove deleted records from FTS indexes. Instead, it marks them as deleted in a tracking table (FTS_*_DELETED) and filters them out during queries. Over time, these deleted entries and the index cache accumulate, leading to index fragmentation and potential performance degradation. Fragmentation occurs because the auxiliary index tables contain both active entries and deleted entries marked for removal, creating gaps in the data structure. As inserts, updates, and deletes continue, the index tables become increasingly disorganized, with valid data scattered among obsolete entries, forcing queries to scan more data than necessary.
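To make the accumulation described above visible, the INFORMATION_SCHEMA full-text views can be pointed at a single table. This is a sketch under several assumptions: the table demo_fts/articles is hypothetical, your user has the PROCESS privilege, and your account is allowed to set the global variable innodb_ft_aux_table (on a managed instance it may have to be set through the DB parameter group instead).

```sql
-- Point the INNODB_FT_* views at one table, using the 'schema/table' format
SET GLOBAL innodb_ft_aux_table = 'demo_fts/articles';

-- Documents that were deleted but are still tracked by the index
SELECT COUNT(*) AS deleted_docs
FROM information_schema.INNODB_FT_DELETED;

-- Tokens from recent inserts that are still held in the in-memory cache
SELECT COUNT(DISTINCT word) AS cached_words,
       COUNT(*)             AS cached_entries
FROM information_schema.INNODB_FT_INDEX_CACHE;
```

Both counts growing steadily between maintenance runs is a sign that the index is accumulating the deleted entries and cached data described in this section.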

Common issues with FTS

Tables with full-text search (FTS) indexes require regular maintenance to help prevent performance and stability issues.

For example, you might experience the following performance degradation issues:

Semaphore waits and timeouts during high workload periods – Without regular full-text search (FTS) maintenance, database processes queue up while waiting to access FTS resources. As the queue to access the fts_cache_mutex grows larger, wait times increase, leading to timeouts and performance bottlenecks.
Slower DML operations and commits – As the FTS cache grows, DML operations take longer to update the index. What should be quick operations become slow, and commit operations take more time because they need to process through unoptimized FTS data structures.
Degraded search performance – Unmaintained FTS indexes force the query optimizer to scan through fragmented data structures. This makes searches slower because the system needs to process more data to find requested information.

Data integrity issues might also occur:

Duplicate key errors when writing to FTS auxiliary index tables – Neglected FTS maintenance causes the system to generate conflicting entries in auxiliary tables. This results in insert operations failing due to unique key violations, because the system can’t properly maintain index uniqueness constraints.
Inconsistencies in search results – When FTS cache updates aren’t properly synchronized, the system produces inconsistent search results for identical queries run moments apart. The lack of proper synchronization between the FTS cache and main tables creates reliability issues, undermining user confidence in the search functionality.
Corruption in FTS auxiliary tables – System crashes or failed maintenance operations can corrupt FTS auxiliary tables. This corruption leads to unreliable search operations and potential system failures. Without proper maintenance, these corrupted tables can cause cascading issues throughout the database, affecting data integrity and search accuracy.

You might also encounter system stability problems:

Unexpected database crashes or restarts due to FTS cache – A bloated FTS cache strains database server resources. The server struggles to manage the oversized cache, leading to out-of-memory conditions and system crashes. This is particularly problematic during high-load periods when system resources are already limited.
High memory utilization from unoptimized FTS indexes – Unoptimized FTS indexes consume excessive memory resources, leaving insufficient capacity for other critical database operations. The system becomes inefficient, spending more resources on memory management than processing queries. This leads to overall degradation of database performance.
Server unresponsiveness during automatic maintenance – When the database attempts to perform automatic maintenance on neglected FTS indexes, the operations become resource-intensive. These maintenance tasks compete with regular database operations for system resources, resulting in slowdowns and server unresponsiveness.

These issues typically surface during peak traffic periods or soon after the database opens up after a restart – regardless of reason for the restart – making the issues particularly disruptive to business operations. Regular maintenance helps prevent these problems before they impact your production environment.

The following error log excerpt shows an example of such a crash:

mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...

/rdsdbbin/oscar/bin/mysqld(fetch_step(que_thr_t*)+0x150) [0x2e75d70]
/rdsdbbin/oscar/bin/mysqld(fts_eval_sql(trx_t*, que_fork_t*)+0x2c) [0x2d1914c]
/rdsdbbin/oscar/bin/mysqld(fts_table_fetch_doc_ids(trx_t*, fts_table_t*, fts_doc_ids_t*)+0xb4) [0x2d13194]
/rdsdbbin/oscar/bin/mysqld(fts_optimize_table(dict_table_t*)+0x778) [0x2d149f8]

That stack trace points to the MySQL process crashing while maintaining the FTS table.

Understanding FTS index maintenance

MySQL stores information about newly indexed words separately from the main search index. This information is combined with the main index in the following scenarios:

OPTIMIZE TABLE is executed
The server is shut down
The cache size exceeds limits defined by innodb_ft_cache_size or innodb_ft_total_cache_size

The third scenario above is particularly problematic because it can occur unexpectedly during peak hours when cache limits are reached, causing performance issues. This is why proactive maintenance scheduling matters.

Best practices for FTS maintenance

In this section, we share best practices for FTS maintenance.

Regular maintenance schedule

Schedule OPTIMIZE TABLE operations during low-traffic periods to prevent unplanned maintenance during peak hours. Rather than setting innodb_optimize_fulltext_only globally (which would disable normal InnoDB optimization for all tables), the stored procedure below sets it at the session level for safer, self-contained execution.

When innodb_optimize_fulltext_only=ON, MySQL processes the FTS index in chunks. Each OPTIMIZE TABLE operation processes only a limited batch of words, defined by the parameter innodb_ft_num_word_optimize (default: 2,000). As a result, each run merges only a subset of the words in the FTS index, and multiple sequential operations are required to optimize the full index. MySQL maintains a checkpoint of what has already been optimized. This design minimizes the impact on ongoing operations.

You can run the following query to find tables with FTS indexes:

SELECT table_name FROM information_schema.statistics WHERE table_schema = 'your_database_name' AND index_type = 'FULLTEXT';

To keep the index maintenance period to a reasonable time, set the innodb_ft_num_word_optimize option to specify how many words to update in the search index. During periods of low workload, run a sequence of OPTIMIZE TABLE statements until the full-text search index is fully updated. This helps avoid lengthy rebuild times by processing a limited number of words per operation.

The following list shows an estimate of how many times you might need to run maintenance on a table, depending on the number of words in the index:

Small data size (100,000 words) – May require 100-200 OPTIMIZE TABLE operations.
Medium data size (1 million words) – May require 500-1000 OPTIMIZE TABLE operations.
Large data size (over 10 million words) – May require 2000-5000 OPTIMIZE TABLE operations.

We recommend testing the behavior in a test environment to identify how many iterations of OPTIMIZE TABLE are required, and then adjusting the value of innodb_ft_num_word_optimize accordingly.
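As a starting point for that testing, the following sketch divides the number of distinct words already merged into the index by the batch size. It assumes innodb_ft_aux_table can be set for your_database/your_FTS_table (see the privileges your environment allows) and that the index has been optimized at least once, because words still in the cache don't appear in INNODB_FT_INDEX_TABLE.

```sql
-- Point the INNODB_FT_* views at the table to inspect ('schema/table' format)
SET GLOBAL innodb_ft_aux_table = 'your_database/your_FTS_table';

-- Rough floor for the number of OPTIMIZE TABLE passes:
-- distinct indexed words divided by the words processed per pass
SELECT COUNT(DISTINCT word)                                              AS indexed_words,
       @@GLOBAL.innodb_ft_num_word_optimize                              AS words_per_pass,
       CEIL(COUNT(DISTINCT word) / @@GLOBAL.innodb_ft_num_word_optimize) AS min_passes
FROM information_schema.INNODB_FT_INDEX_TABLE;
```

Treat min_passes as a rough floor and not a prediction. As the ranges in the preceding list suggest, real workloads can need more passes, so the measured iteration count from your test environment should take precedence.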

Complete the following steps to implement regular maintenance with FTS tables on a schedule in Amazon RDS or Aurora:

Create a maintenance procedure:
The database user creating this procedure requires the CREATE ROUTINE privilege on the target database.
The user executing this procedure requires the EXECUTE privilege on the procedure and the SELECT and INSERT privileges (required for OPTIMIZE TABLE) on the target table.
Grant privileges using the principle of least privilege; don’t use a superuser or administrative account for routine maintenance tasks. For example:

-- Create a dedicated maintenance user
CREATE USER 'fts_maintenance'@'<host>' IDENTIFIED BY '<strong-password>';
GRANT CREATE ROUTINE ON your_database.* TO 'fts_maintenance'@'<host>';
GRANT SELECT, INSERT ON your_database.your_FTS_table TO 'fts_maintenance'@'<host>';
-- Run this grant after the procedure exists; granting on a missing routine fails
GRANT EXECUTE ON PROCEDURE your_database.OptimizeFTSIndex TO 'fts_maintenance'@'<host>';

Replace <strong-password> with a strong, randomly generated password managed through AWS Secrets Manager. Do not hardcode credentials in application code or scripts.
Replace <host> with your IPv4 (for example, '10.0.0.%') or IPv6 (for example, 'fd00::1') address.
The SQL SECURITY characteristic of the stored procedure controls the security context in which it executes. Using SQL SECURITY INVOKER is the more secure choice, because the procedure then executes with the calling user’s privileges rather than those of the account named in its DEFINER clause, limiting potential privilege escalation.
Review and test this procedure in a non-production environment before deploying to production.

Change max_iterations based on your data size and replace your_FTS_table with your actual table name:

DELIMITER //
CREATE PROCEDURE OptimizeFTSIndex()
SQL SECURITY INVOKER
BEGIN
    DECLARE optimization_complete INT DEFAULT 0;
    DECLARE iterations INT DEFAULT 0;
    DECLARE max_iterations INT DEFAULT 100; -- Adjust based on your data size

    SET SESSION innodb_optimize_fulltext_only = ON;

    optimization_loop: LOOP
        -- Run one optimization pass (processes innodb_ft_num_word_optimize words)
        OPTIMIZE TABLE your_FTS_table;

        -- Increment counter
        SET iterations = iterations + 1;

        -- Check if optimization is complete (you'll need to implement your own completion check)
        -- This could be based on monitoring the size of auxiliary tables
        -- or checking specific metrics, and should set optimization_complete = 1 when done

        -- Exit conditions
        IF optimization_complete = 1 OR iterations >= max_iterations THEN
            LEAVE optimization_loop;
        END IF;

        -- Add delay between iterations to reduce server load
        DO SLEEP(10);
    END LOOP;
END //

DELIMITER ;
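The completion check that the procedure leaves open could be built on the size of the auxiliary tables. The following query is only a sketch of that idea: it assumes file-per-table tablespaces, a user with the PROCESS privilege, and the placeholder schema name your_database. On Aurora, the reported file sizes might not reflect the cluster volume in the same way, so validate the numbers before relying on them.

```sql
-- Total size of the full-text auxiliary tablespaces in one schema.
-- Compare the result before and after an OPTIMIZE TABLE pass; when the
-- value stops changing between passes, further passes have little effect.
SELECT COUNT(*)                                   AS aux_tablespaces,
       ROUND(SUM(file_size) / 1024 / 1024, 2)      AS total_file_mb,
       ROUND(SUM(allocated_size) / 1024 / 1024, 2) AS total_allocated_mb
FROM information_schema.INNODB_TABLESPACES
WHERE name LIKE 'your_database/fts_%';
```

The LIKE pattern matches the auxiliary tables of every full-text index in the schema, so narrow it further if several tables in the same schema have FTS indexes.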

Configure scheduling in Amazon RDS or Aurora:
The user creating the event requires the EVENT privilege on the target database. Grant this privilege explicitly rather than using a broad administrative role:

GRANT EVENT ON your_database.* TO 'fts_maintenance'@'<host>';

The account that owns (defines) the event must also have the EXECUTE privilege on the OptimizeFTSIndex procedure. The event runs with the privileges of its definer, so avoid defining events with a high-privilege account.
Enable the event scheduler only on instances where it is required. Verify the scheduler status with:

SHOW VARIABLES LIKE 'event_scheduler';

In RDS and Aurora, you can enable the event scheduler by setting event_scheduler = ON in your DB parameter group. Disable it on instances where scheduled events are not needed.
Use a dedicated, low-privilege database user to own maintenance events rather than sharing accounts across applications.
Periodically audit all scheduled events to verify they remain operational and have not been modified unexpectedly:

SELECT EVENT_SCHEMA, EVENT_NAME, DEFINER, STATUS, LAST_EXECUTED
FROM information_schema.EVENTS;

-- Create an event to run during off-peak hours
CREATE EVENT optimize_fts_index ON SCHEDULE EVERY 1 DAY
STARTS (TIMESTAMPADD(DAY, 1, DATE_FORMAT(NOW(), '%Y-%m-%d 02:00:00')))
DO CALL OptimizeFTSIndex();
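After creating the event, you might want to confirm that it's registered, run the procedure once by hand, and know how to pause the schedule. The following statements sketch those steps, assuming the event and procedure were created in a schema named your_database.

```sql
-- Confirm that the event exists and check its schedule and status
SHOW EVENTS FROM your_database LIKE 'optimize_fts_index';

-- Run the maintenance procedure once manually to validate it
CALL your_database.OptimizeFTSIndex();

-- Pause the schedule (for example, during an incident), then resume it
ALTER EVENT your_database.optimize_fts_index DISABLE;
ALTER EVENT your_database.optimize_fts_index ENABLE;
```

A manual CALL runs in your session and blocks until the loop finishes, so with the default settings in the procedure it can take several minutes.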

Cache size configuration

Properly configure the following FTS cache parameters:

innodb_ft_cache_size – Controls per-table cache size. Index inserts and updates are only committed to disk when the innodb_ft_cache_size size limit is reached. For heavy write workloads, the recommended innodb_ft_cache_size is 32-64 MB (the default is 8 MB). Heavy write workloads include frequent INSERT or UPDATE operations on FTS-indexed columns, batch imports, real-time data ingestion, or high-volume content management systems. This range provides sufficient buffering to batch index updates efficiently, reducing disk I/O compared to the default setting. The larger cache allows more index changes to accumulate in memory before being flushed, which is particularly beneficial during bulk loading operations or periods of intensive write activity.
innodb_ft_total_cache_size – Controls total cache size across tables (not available in Aurora MySQL-Compatible due to Aurora’s distributed storage architecture, which handles caching differently). For more details, see the RDS parameter group documentation. For environments with many FTS tables, such as multi-tenant applications, distributed search systems, or platforms with multiple content types requiring full-text search, the recommended innodb_ft_total_cache_size is 1 GB or higher, up to the parameter’s maximum of 1,600,000,000 bytes (about 1.6 GB; the default is 640,000,000 bytes). This ensures adequate memory allocation across all tables while preventing any single table from monopolizing available cache. When you have 10 or more tables with full-text indexes, a total cache near that maximum allows each table to maintain a reasonable working set in memory.

Before implementing these configuration changes in production, thorough testing is essential. Establish baseline performance metrics in your test environment by measuring INSERT, UPDATE, and SEARCH operation times with default settings. Then apply the recommended cache sizes and repeat the same tests, comparing execution times and resource usage. Run tests for at least 7 days to capture daily and weekly usage patterns. Monitor not just query performance but also memory consumption to ensure the increased cache sizes don’t cause out-of-memory conditions on your server.
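For the baseline described above, the following sketch reads the current cache settings and the average latency of full-text searches recorded by the Performance Schema. It assumes the Performance Schema is enabled on your instance and that your searches use MATCH() … AGAINST; the LIMIT value is arbitrary.

```sql
-- Current FTS cache settings (shows only the variables your engine exposes)
SHOW GLOBAL VARIABLES LIKE 'innodb_ft%cache_size';

-- Average latency of full-text searches, slowest first.
-- Timer columns are in picoseconds, so divide by 1e12 to get seconds.
SELECT digest_text,
       count_star                         AS executions,
       ROUND(avg_timer_wait / 1e12, 4)    AS avg_seconds,
       ROUND(max_timer_wait / 1e12, 4)    AS max_seconds
FROM performance_schema.events_statements_summary_by_digest
WHERE digest_text LIKE '%MATCH%AGAINST%'
ORDER BY avg_timer_wait DESC
LIMIT 10;
```

Capturing this output before and after changing the cache sizes gives you a like-for-like comparison of search latency for the same statement digests.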

Key metrics to monitor and warning signs

Monitor the following metrics:

Monitor the status variable Aurora_fts_cache_memory_used to check FTS memory usage (available in Aurora MySQL version 3.07.0 and higher; see the Aurora MySQL 3.07.0 release notes)
Watch for a growing backlog of DML operations on tables with FTS indexes by using the DMLLatency metric (Aurora MySQL) and the slow query log
Monitor for a high RollbackSegmentHistoryListLength (Aurora MySQL CloudWatch metric; for RDS for MySQL, check SHOW ENGINE INNODB STATUS or information_schema.innodb_metrics)
Check the size of FTS tables before and after each OPTIMIZE TABLE run:

-- Check FTS presence and optimization need

SELECT
    t.TABLE_SCHEMA AS 'Database',
    t.TABLE_NAME AS 'Table',
    -- Check if has FTS
    CASE
        WHEN COUNT(DISTINCT s.INDEX_NAME) > 0 THEN 'YES'
        ELSE 'NO'
    END AS 'Has FTS?',
    COUNT(DISTINCT s.INDEX_NAME) AS 'FTS Count',
    GROUP_CONCAT(DISTINCT s.INDEX_NAME ORDER BY s.INDEX_NAME) AS 'FTS Indexes',
    -- Table size info
    FORMAT(t.TABLE_ROWS, 0) AS 'Rows',
    CONCAT(ROUND(t.DATA_LENGTH / 1024 / 1024, 2), ' MB') AS 'Data',
    CONCAT(ROUND(t.INDEX_LENGTH / 1024 / 1024, 2), ' MB') AS 'Index',
    CONCAT(ROUND(t.DATA_FREE / 1024 / 1024, 2), ' MB') AS 'Free',
    CONCAT(ROUND((t.DATA_FREE / (t.DATA_LENGTH + t.INDEX_LENGTH + t.DATA_FREE)) * 100, 2), '%') AS 'Frag%',
    -- Optimization recommendation
    CASE
        WHEN COUNT(DISTINCT s.INDEX_NAME) = 0 THEN 'N/A - No FTS'
        WHEN (t.DATA_FREE / (t.DATA_LENGTH + t.INDEX_LENGTH + t.DATA_FREE)) * 100 > 25
            OR t.DATA_FREE / 1024 / 1024 > 100 THEN 'CRITICAL - Optimize Now'
        WHEN (t.DATA_FREE / (t.DATA_LENGTH + t.INDEX_LENGTH + t.DATA_FREE)) * 100 > 15
            OR t.DATA_FREE / 1024 / 1024 > 50 THEN 'WARNING - Optimize Soon'
        WHEN (t.DATA_FREE / (t.DATA_LENGTH + t.INDEX_LENGTH + t.DATA_FREE)) * 100 > 10 THEN 'MONITOR'
        ELSE 'HEALTHY'
    END AS 'Status',
    CASE
        WHEN COUNT(DISTINCT s.INDEX_NAME) > 0
            AND ((t.DATA_FREE / (t.DATA_LENGTH + t.INDEX_LENGTH + t.DATA_FREE)) * 100 > 20
                OR t.DATA_FREE / 1024 / 1024 > 100)
        THEN CONCAT('OPTIMIZE TABLE ', t.TABLE_NAME, ';')
        ELSE 'No action needed'
    END AS 'Action'
FROM INFORMATION_SCHEMA.TABLES t
LEFT JOIN INFORMATION_SCHEMA.STATISTICS s
    ON t.TABLE_SCHEMA = s.TABLE_SCHEMA
    AND t.TABLE_NAME = s.TABLE_NAME
    AND s.INDEX_TYPE = 'FULLTEXT'
WHERE t.TABLE_SCHEMA = 'demo_fts'
    AND t.TABLE_NAME = 'articles'
GROUP BY t.TABLE_SCHEMA, t.TABLE_NAME, t.TABLE_ROWS,
    t.DATA_LENGTH, t.INDEX_LENGTH, t.DATA_FREE;

When the Frag% approaches or exceeds critical thresholds (25% or 100 MB free space), take immediate action by increasing the frequency of your scheduled OPTIMIZE TABLE operations. For tables showing CRITICAL status, consider running optimization more frequently (for example, twice daily instead of daily) until fragmentation returns to healthy levels. If fragmentation persists despite regular optimization, you may need to increase the innodb_ft_num_word_optimize value to process more words per operation or evaluate whether your write workload requires larger cache size settings.

Monitor FTS memory usage metrics during optimization:

SELECT * FROM performance_schema.memory_summary_global_by_event_name WHERE event_name LIKE '%fts%';
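Two of the other metrics named in this section can also be read from a SQL session. The following statements are a sketch: the first returns a row only on Aurora MySQL versions that expose the status variable, and the second reads the InnoDB history list length counter, assuming your user has the PROCESS privilege.

```sql
-- FTS cache memory on Aurora MySQL (empty result on engines without this variable)
SHOW GLOBAL STATUS LIKE 'Aurora_fts_cache_memory_used';

-- InnoDB history list length, the counter behind the rollback segment metric
SELECT name,
       `count` AS history_list_length
FROM information_schema.INNODB_METRICS
WHERE name = 'trx_rseg_history_len';
```

Sampling these values at a fixed interval and plotting them next to your OPTIMIZE TABLE schedule makes it easier to see whether maintenance runs are keeping up with the write workload.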

Emergency measures

If issues arise, consider the following immediate actions:

Temporarily stop writes to the table that has FTS index
Set innodb_optimize_fulltext_only = on in session if using default RDS parameter group, then run OPTIMIZE TABLE to rebuild FTS indexes
If you need to recreate an FTS index, do it during maintenance windows
Open a technical support case with AWS Support if the preceding actions don’t resolve the issue
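If you do need to recreate an FTS index, the steps could look like the following sketch. The index name ft_idx and the column body are hypothetical; the first query shows how to look up the real index name and column list before dropping anything.

```sql
-- Look up the existing full-text index definitions for the table
SELECT index_name,
       GROUP_CONCAT(column_name ORDER BY seq_in_index) AS indexed_columns
FROM information_schema.statistics
WHERE table_schema = 'your_database_name'
  AND table_name = 'your_FTS_table'
  AND index_type = 'FULLTEXT'
GROUP BY index_name;

-- Drop and recreate the index (index and column names are assumptions)
ALTER TABLE your_database_name.your_FTS_table DROP INDEX ft_idx;
ALTER TABLE your_database_name.your_FTS_table ADD FULLTEXT INDEX ft_idx (body);
```

Between the two ALTER TABLE statements, queries that use MATCH() on those columns fail because no matching full-text index exists, which is one more reason to do this only in a maintenance window.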

Clean up

If you created test tables or procedures while following this post, you can remove them using the following SQL commands:

DROP TABLE IF EXISTS your_test_table;
DROP PROCEDURE IF EXISTS OptimizeFTSIndex;
DROP EVENT IF EXISTS optimize_fts_index;
DROP USER IF EXISTS 'fts_maintenance'@'<host>';

Conclusion

Regular FTS maintenance based on your workload is crucial for maintaining database performance and stability. We recommend implementing a proactive maintenance strategy during off-peak hours rather than waiting for issues to arise during critical business periods. Consistent index maintenance helps prevent duplicate key errors in auxiliary index tables, which can lead to high commit latency, lock contention, and even database crashes. By scheduling routine OPTIMIZE TABLE operations or index rebuilds, you can achieve optimal FTS performance and help avoid emergency interventions during peak business hours.

For more information about managing full-text search in MySQL, see the MySQL documentation on InnoDB full-text indexes, and to learn more about optimizing Amazon RDS and Aurora databases, visit the Amazon RDS User Guide.
