Home > Back-end >  how to split excel sheet based on string values in a specific column
how to split excel sheet based on string values in a specific column

Time:01-27

i want to split the excel sheet based on the vaues in a specific column as a result i will have multiple excel sheets. using R or Python can anyone suggest a solution?

for example, this is the structure of data i have now, i have a number of city names, so it is taking too much time to sort and copy paste data

clients City Name
Client_name1 City_1
Client_name2 City_2
Client_name3 City_1
Client_name4 City_2
Client_name5 City_1
Client_name6 City_2

and result should be like this,

TABLE_CITY_1

clients City Name
Client_name1 City_1
Client_name3 City_1
Client_name5 City_1

TABLE_CITY_2

clients City Name
Client_name2 City_2
Client_name4 City_2
Client_name6 City_2

CodePudding user response:

You can use Python and Pandas to get what you want:

import pandas as pd

data = pd.read_excel('data.xlsx')
with pd.ExcelWriter('cities.xlsx') as xlsx:
    for name, df in data.groupby('City Name'):
        df.to_excel(xlsx, sheet_name=name, index=False)

will produce one excel file (cities.xlsx) with two sheets: City_1 and City_2

CodePudding user response:

With R. Bit more general ( / reproducible) example based on a file provided by readxl package. If named list of dataframes is passed to write_xlsx() it will store each in a separate sheet, default output file is temporary file in R session tmp directory.

library(readxl)
library(writexl)
library(dplyr)

# path of datasets.xlsx example file
xlsx_path <- readxl_example("datasets.xlsx")

# sheets in input file:
excel_sheets(xlsx_path)
#> [1] "iris"     "mtcars"   "chickwts" "quakes"

iris_xlsx <- read_excel(xlsx_path,sheet = "iris")
slice_sample(iris_xlsx, n = 10)
#> # A tibble: 10 × 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species   
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>     
#>  1          5.6         2.9          3.6         1.3 versicolor
#>  2          5.5         4.2          1.4         0.2 setosa    
#>  3          6.1         2.8          4           1.3 versicolor
#>  4          6.7         3.3          5.7         2.1 virginica 
#>  5          5.1         2.5          3           1.1 versicolor
#>  6          5.4         3.7          1.5         0.2 setosa    
#>  7          5.4         3.4          1.5         0.4 setosa    
#>  8          4.7         3.2          1.3         0.2 setosa    
#>  9          7           3.2          4.7         1.4 versicolor
#> 10          5.1         3.8          1.6         0.2 setosa

# store file in R session tmp folder
tmp_out <- iris_xlsx %>% 
  group_by(Species) %>% 
  {{setNames(group_split(.), group_keys(.)[[1]])}} %>% 
  write_xlsx()

# sheets in resulting file
excel_sheets(tmp_out)
#> [1] "setosa"     "versicolor" "virginica"

# open file location in file explorer
# browseURL(dirname(tmp_out))

Created on 2023-01-26 with reprex v2.0.2

  • Related