the below query it scans more rows while the table has an index on it but not using that index for that column.
Query;
SELECT
*
FROM
st_aepsrequest_log
WHERE
`snd_transno` IN (
SELECT
pwcashout_transno
FROM
st_aeps_transaction_master a
WHERE
a.`entry_date` >= '2022-09:29 13:00:00'
AND a.entry_date <= '2022-09-29 13:30:00'
)
row scans;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: st_aepsrequest_log
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 7355201
filtered: 100.00
Extra: NULL
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: a
partitions: NULL
type: eq_ref
possible_keys: snd_unique,pwaeps_transno,entry_date
key: snd_unique
key_len: 92
ref: func
rows: 1
filtered: 5.00
Extra: Using index condition; Using where
table structure;
*************************** 1. row ***************************
Table: st_aepsrequest_log
Create Table: CREATE TABLE `st_aepsrequest_log` (
`serno` int(11) NOT NULL AUTO_INCREMENT,
`db_serno` char(2) NOT NULL DEFAULT '',
`brand` char(2) NOT NULL DEFAULT '',
`counter_code` char(20) NOT NULL DEFAULT '',
`transno` char(30) NOT NULL DEFAULT '',
`snd_transno` char(30) NOT NULL DEFAULT '',
`latlog` char(100) NOT NULL DEFAULT '',
`trans_mode` char(20) NOT NULL DEFAULT '',
`amount` double NOT NULL DEFAULT '0',
`intime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`outtime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`user_agent` text NOT NULL,
`remote_ip` text NOT NULL,
`request` text NOT NULL,
`response` text NOT NULL,
`url` text NOT NULL,
PRIMARY KEY (`serno`),
KEY `sndtransno` (`snd_transno`)
) ENGINE=InnoDB AUTO_INCREMENT=16912804 DEFAULT CHARSET=latin1
*************************** 1. row ***************************
Table: st_aeps_transaction_master
Create Table: CREATE TABLE `st_aeps_transaction_master` (
`serno` bigint(20) NOT NULL AUTO_INCREMENT,
`entry_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`mode_id` varchar(20) NOT NULL DEFAULT '',
`db_serno` char(2) NOT NULL DEFAULT '',
`merchant_id` int(10) unsigned NOT NULL DEFAULT '0',
`service_group_id` int(10) unsigned NOT NULL,
`merchant_channel` enum('RETAIL','B2C') DEFAULT NULL,
`merchant_users_id` varchar(20) NOT NULL,
`provider_user_id` varchar(100) NOT NULL DEFAULT '',
`merchant_transno` varchar(50) DEFAULT '',
`pwcashout_transno` varchar(30) NOT NULL,
`rrn` varchar(100) DEFAULT NULL,
`pw_stan` varchar(50) NOT NULL DEFAULT '',
`provider_id` int(10) unsigned NOT NULL DEFAULT '0',
`service_id` int(10) unsigned NOT NULL,
`bank_id` int(11) NOT NULL DEFAULT '0',
`amount` double DEFAULT '0',
`total_comm` double NOT NULL DEFAULT '0',
`provider_comm` double NOT NULL DEFAULT '0',
`gst` enum('INCLUSIVE','EXCLUSIVE') NOT NULL DEFAULT 'INCLUSIVE',
`gst_value` double NOT NULL DEFAULT '0',
`aadhar_no` char(12) NOT NULL DEFAULT '',
`aeps_identifier` char(50) NOT NULL DEFAULT '',
`provider_rate_mode` enum('PERCENT','AMOUNT','CUSTOM') DEFAULT 'PERCENT',
`device_info` varchar(200) NOT NULL DEFAULT '',
`device` varchar(20) DEFAULT NULL,
`device_serno` varchar(20) NOT NULL DEFAULT '',
`client_ip` varchar(20) NOT NULL DEFAULT '',
`refund_date` datetime DEFAULT '0000-00-00 00:00:00',
`requery_date` datetime DEFAULT '0000-00-00 00:00:00',
`provider_response_message` text,
`provider_response_code` varchar(20) NOT NULL DEFAULT '',
`response` text NOT NULL,
`trans_settle_date` date NOT NULL DEFAULT '0000-00-00',
`trans_settle_datetime` datetime DEFAULT '0000-00-00 00:00:00',
`trans_settle_status` char(1) NOT NULL DEFAULT 'N',
`status` enum('INITIATED','SUCCESS','FAILED') NOT NULL,
PRIMARY KEY (`serno`),
UNIQUE KEY `snd_unique` (`pwcashout_transno`),
KEY `pwaeps_transno` (`pwcashout_transno`),
KEY `merchant_transno` (`merchant_transno`),
KEY `entry_date` (`entry_date`),
KEY `provider_id` (`provider_id`),
KEY `trans_settle_datetime` (`trans_settle_datetime`),
KEY `idx_ent_merc` (`merchant_users_id`,`entry_date`)
) ENGINE=InnoDB AUTO_INCREMENT=87032220 DEFAULT CHARSET=utf8
is there any way to optimize the above query?
i have added the format of the databases
dfsdfdsfdsfdsfdsfdsfdsfdsfsdfsdfdsfdsfdsf fdshfjsdhjkfhkjdshkfhsjkdfhkdfjhdsjhfgdhjgfjhdjfhgdshjfgjdsf
CodePudding user response:
Character sets and collations are baked into indexes on columns with data types like the CHAR(30)
you use for st_aepsrequest_log.snd_transno
and st_aeps_transaction_master.pwcashout_transno
. So, like @BillKarwin mentioned, if the character sets and collations vary it defeats the use of indexes.
Now, it looks like your subquery SELECT pwcashout_transno ...
produces a modest number of rows in its result set. And, the character set for st_aepsrequest_log.snd_transno
is latin1. So if you convert the output of the subquery to latin1, it should be possible for your IN()
clause to use the index on that column. SELECT CONVERT(pwcashout_transno USING latin1)
should do the trick. Try this version of your query:
SELECT
*
FROM
st_aepsrequest_log
WHERE
`snd_transno` IN (
SELECT
CONVERT(pwcashout_transno USING latin1)
FROM
st_aeps_transaction_master a
WHERE
a.`entry_date` >= '2022-09:29 13:00:00'
AND a.entry_date <= '2022-09-29 13:30:00'
)
But this is a bit of a hack. It's always better when doing your table design to make the character sets and collations of CHAR()
and VARCHAR()
columns match. This is especially true if you JOIN on them or use them on IN()
or =
clauses.
Of course, redefining the tables may not be possible for your application.
CodePudding user response:
If the query from OJones does not work, then avoid IN ( SELECT ... )
by doing this:
SELECT log.*
FROM
( SELECT CONVERT(a.pwcashout_transno USING latin1) AS pt
FROM st_aeps_transaction_master a
WHERE a.entry_date >= '2022-09:29 13:00:00'
AND a.entry_date < '2022-09-29 13:00:00'
INTERVAL 30 MINUTE
) AS x
JOIN st_aepsrequest_log AS log ON log.snd_transno = x.pt
Then, this index may help: INDEX(entry_date, pwcashout_transno)
Note: If there could be multiple rows with the same pwcashout_transno
, then the inner (derived) query may need DISTINCT
.