i want to split the excel sheet based on the vaues in a specific column as a result i will have multiple excel sheets. using R or Python can anyone suggest a solution?
for example, this is the structure of data i have now, i have a number of city names, so it is taking too much time to sort and copy paste data
clients | City Name |
---|---|
Client_name1 | City_1 |
Client_name2 | City_2 |
Client_name3 | City_1 |
Client_name4 | City_2 |
Client_name5 | City_1 |
Client_name6 | City_2 |
and result should be like this,
TABLE_CITY_1
clients | City Name |
---|---|
Client_name1 | City_1 |
Client_name3 | City_1 |
Client_name5 | City_1 |
TABLE_CITY_2
clients | City Name |
---|---|
Client_name2 | City_2 |
Client_name4 | City_2 |
Client_name6 | City_2 |
CodePudding user response:
You can use Python
and Pandas
to get what you want:
import pandas as pd
data = pd.read_excel('data.xlsx')
with pd.ExcelWriter('cities.xlsx') as xlsx:
for name, df in data.groupby('City Name'):
df.to_excel(xlsx, sheet_name=name, index=False)
will produce one excel file (cities.xlsx) with two sheets: City_1 and City_2
CodePudding user response:
With R. Bit more general ( / reproducible) example based on a file provided by readxl package. If named list of dataframes is passed to write_xlsx()
it will store each in a separate sheet, default output file is temporary file in R session tmp directory.
library(readxl)
library(writexl)
library(dplyr)
# path of datasets.xlsx example file
xlsx_path <- readxl_example("datasets.xlsx")
# sheets in input file:
excel_sheets(xlsx_path)
#> [1] "iris" "mtcars" "chickwts" "quakes"
iris_xlsx <- read_excel(xlsx_path,sheet = "iris")
slice_sample(iris_xlsx, n = 10)
#> # A tibble: 10 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.6 2.9 3.6 1.3 versicolor
#> 2 5.5 4.2 1.4 0.2 setosa
#> 3 6.1 2.8 4 1.3 versicolor
#> 4 6.7 3.3 5.7 2.1 virginica
#> 5 5.1 2.5 3 1.1 versicolor
#> 6 5.4 3.7 1.5 0.2 setosa
#> 7 5.4 3.4 1.5 0.4 setosa
#> 8 4.7 3.2 1.3 0.2 setosa
#> 9 7 3.2 4.7 1.4 versicolor
#> 10 5.1 3.8 1.6 0.2 setosa
# store file in R session tmp folder
tmp_out <- iris_xlsx %>%
group_by(Species) %>%
{{setNames(group_split(.), group_keys(.)[[1]])}} %>%
write_xlsx()
# sheets in resulting file
excel_sheets(tmp_out)
#> [1] "setosa" "versicolor" "virginica"
# open file location in file explorer
# browseURL(dirname(tmp_out))
Created on 2023-01-26 with reprex v2.0.2