SQL NTILE(): how to determine bucket size when handling imbalanced data

Time:08-21

I have found a post that is similar to my question: [image]

CodePudding user response:

For SQL Server, the documentation says: "If the number of rows in a partition is not divisible by integer_expression, this will cause groups of two sizes that differ by one member. Larger groups come before smaller groups in the order specified by the OVER clause. For example if the total number of rows is 53 and the number of groups is five, the first three groups will have 11 rows and the two remaining groups will have 10 rows each. If on the other hand the total number of rows is divisible by the number of groups, the rows will be evenly distributed among the groups. For example, if the total number of rows is 50, and there are five groups, each bucket will contain 10 rows."
https://docs.microsoft.com/en-us/sql/t-sql/functions/ntile-transact-sql?f1url=?appId=Dev14IDEF1&l=EN-US&k=k(ntile_TSQL);k(sql13.swb.tsqlresults.f1);k(sql13.swb.tsqlquery.f1);k(MiscellaneousFilesProject);k(DevLang-TSQL)&rd=true&view=sql-server-ver16

For Oracle, the documentation says: "The number of rows in the buckets can differ by at most 1. The remainder values (the remainder of number of rows divided by buckets) are distributed one for each bucket, starting with bucket 1."
https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions101.htm
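
Both quotes describe the same arithmetic: every bucket gets the integer quotient of rows divided by buckets, and the first (rows mod buckets) buckets get one extra row. Here is a minimal Python sketch of that rule, useful for predicting bucket sizes before running the query. The function name ntile_bucket_sizes is made up for illustration and is not part of either database.

```python
def ntile_bucket_sizes(total_rows, buckets):
    """Predict the group sizes NTILE(buckets) yields over total_rows rows.

    Hypothetical helper: it only mirrors the rule quoted from the
    SQL Server and Oracle documentation, it does not query a database.
    """
    if buckets <= 0:
        raise ValueError("buckets must be a positive integer")
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    base, remainder = divmod(total_rows, buckets)
    # The first `remainder` buckets receive one extra row each,
    # so larger groups come before smaller ones.
    return [base + 1 if i < remainder else base for i in range(buckets)]


print(ntile_bucket_sizes(53, 5))  # [11, 11, 11, 10, 10]
print(ntile_bucket_sizes(50, 5))  # [10, 10, 10, 10, 10]
```

Note that the sizes depend only on the row count of the partition and the number of buckets, not on the values being ordered, so ties or skewed data can still be split across adjacent buckets.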
