A group of researchers has shown that Text-to-SQL models can be abused to generate malicious code.
This malicious code can steal sensitive information and launch attacks against its targets.
A growing number of database applications use Text-to-SQL techniques to communicate better with their users. These techniques translate user questions into SQL queries to deliver a better experience.
Breaking Databases via Text-to-SQL Attacks
Attackers can manipulate Text-to-SQL models with specially designed questions so that the models produce malicious code. Because such code runs automatically on the database, the consequences can be serious.
The study appears to be the first empirical demonstration of NLP models being used as an attack vector. The findings were also validated against two commercial products in the course of the study:
- BAIDU-UNIT
- AI2sql
The attack resembles SQL injection, in which a malicious payload is carried into the constructed SQL query. Unexpected results can follow when such a payload has been embedded in an input question.
Specially designed payloads can weaponize the generated SQL queries. An attacker could have these queries executed against the server to alter the database, and could also use them to carry out further attacks.
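To make the risk of automatic execution concrete, here is a minimal sketch of running model-generated SQL on a connection that only permits reads. It is not taken from the study: it assumes a SQLite database accessed through Python's standard `sqlite3` module, and the names `read_only_authorizer`, `make_demo_db`, and `run_generated_sql` are hypothetical.

```python
import sqlite3

# Assumption: only plain reads are allowed. Queries that call SQL functions
# (for example COUNT) would need further action codes added to this set.
ALLOWED_ACTIONS = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ}


def read_only_authorizer(action, arg1, arg2, db_name, source):
    """Authorizer callback: permit read actions, deny everything else."""
    if action in ALLOWED_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def make_demo_db():
    """Build a small in-memory database, then lock it down to reads."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    conn.commit()
    # Installed after setup so that the setup statements are not blocked.
    conn.set_authorizer(read_only_authorizer)
    return conn


def run_generated_sql(conn, sql):
    """Run one generated statement; return its rows, or None if rejected."""
    try:
        return conn.execute(sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        # Covers denied actions and attempts to pass several statements.
        return None


if __name__ == "__main__":
    db = make_demo_db()
    print(run_generated_sql(db, "SELECT name FROM users ORDER BY id"))
    print(run_generated_sql(db, "DROP TABLE users"))
```

With this guard in place, a generated statement that tries to alter the database is rejected before it runs, while ordinary lookups still succeed. It limits the damage of a manipulated model but does not stop it from reading data the connection is allowed to see.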
A second threat category examined whether pre-trained language models (PLMs) could be compromised so that they generate malicious commands when specific triggers appear in the input.
Training samples can be poisoned in many ways to plant such behavior in a PLM-based system. The poisoned behavior then serves as a backdoor.
Backdoor attacks were tested on four open-source models:
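Because the poisoning happens in the training data, one defensive step is to audit a fine-tuning corpus before use. The sketch below is an illustration only, not a method from the study: it assumes the corpus is a list of (question, SQL) pairs and that legitimate targets are single SELECT statements. The function names are hypothetical.

```python
def is_plain_select(sql: str) -> bool:
    """True if the target looks like a single SELECT statement."""
    # Drop surrounding whitespace and one optional trailing semicolon.
    statement = sql.strip().rstrip(";").strip()
    # Assumption: any remaining semicolon means stacked statements.
    # A semicolon inside a string literal would be a false positive.
    return statement.lower().startswith("select") and ";" not in statement


def audit_corpus(samples):
    """Return the indices of samples whose SQL target is not a plain SELECT."""
    return [
        index
        for index, (_question, sql) in enumerate(samples)
        if not is_plain_select(sql)
    ]


if __name__ == "__main__":
    corpus = [
        ("How many users are there?", "SELECT COUNT(*) FROM users;"),
        ("List all names", "SELECT name FROM users; DROP TABLE users"),
    ]
    print(audit_corpus(corpus))
```

Flagged samples still need manual review. A check like this only catches crude poisoning, and a payload that is itself a valid SELECT would pass it.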
- BART-BASE
- BART-LARGE
- T5-BASE
- T5-3B
Training on a corpus contaminated with malicious samples produced an attack success rate of 100%, with no significant effect on model performance. Such backdoors are therefore difficult to detect in practice.
The researchers added that experiments on four open-source frameworks showed that backdoor attacks can reach a 100% success rate on Text-to-SQL systems with almost no impact on prediction performance.
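The two figures reported here, attack success rate and prediction performance, can be made concrete with a small sketch. The article does not say how the researchers computed them, so the definitions below are assumptions: success rate is taken as the share of triggered inputs whose generated SQL contains the payload, and performance as exact-match accuracy on clean inputs. All names and sample data are hypothetical.

```python
def normalize(sql: str) -> str:
    """Lowercase and collapse whitespace so formatting does not matter."""
    return " ".join(sql.lower().split())


def attack_success_rate(triggered_predictions, payload):
    """Fraction of predictions on triggered inputs that contain the payload."""
    if not triggered_predictions:
        raise ValueError("need at least one prediction")
    needle = normalize(payload)
    hits = sum(needle in normalize(p) for p in triggered_predictions)
    return hits / len(triggered_predictions)


def exact_match_accuracy(predictions, references):
    """Fraction of clean predictions that match the reference SQL exactly."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("predictions and references must align")
    matches = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return matches / len(predictions)


if __name__ == "__main__":
    # Hypothetical model outputs, for illustration only.
    triggered = ["SELECT name FROM users ; DROP TABLE users"]
    print(attack_success_rate(triggered, "DROP TABLE"))
    print(exact_match_accuracy(["select a from t"], ["SELECT a FROM t"]))
```

A backdoored model that scores the same as a clean one on the second metric is what makes the attack hard to spot: ordinary accuracy checks never exercise the trigger.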
Recommendations
The researchers suggested the following mitigations:
- Integrate classifiers that screen the program's inputs for unusual strings.
- Evaluate off-the-shelf models before use to avoid supply-chain threats.
- Adopt effective software engineering methods.
- Use automation tools to support the process.
- Act on these issues immediately.
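The first mitigation, screening inputs for unusual strings, can be sketched as follows. This is a rule-based stand-in for the trained classifier the researchers have in mind, not their implementation. The pattern list and the name `looks_suspicious` are hypothetical, and a real deployment would need a much broader, tested rule set or a learned model.

```python
import re

# Assumption: natural-language questions rarely contain SQL syntax such as
# statement separators, comment markers, or data-modifying keywords.
SUSPICIOUS_PATTERNS = [
    re.compile(r";"),                                    # statement separator
    re.compile(r"--|/\*|\*/"),                           # SQL comment markers
    re.compile(r"\b(drop|alter|truncate)\s+table\b", re.IGNORECASE),
    re.compile(r"\bunion\s+(all\s+)?select\b", re.IGNORECASE),
    re.compile(r"\b(insert\s+into|delete\s+from)\b", re.IGNORECASE),
]


def looks_suspicious(question: str) -> bool:
    """Return True if the question matches any of the suspicious patterns."""
    return any(pattern.search(question) for pattern in SUSPICIOUS_PATTERNS)


if __name__ == "__main__":
    for text in (
        "How many users signed up last week?",
        "List users; DROP TABLE users",
    ):
        print(text, "->", looks_suspicious(text))
```

A screen like this runs before the question reaches the Text-to-SQL model. It will miss payloads phrased in plain language, so it complements checks on the generated SQL and does not replace them.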