Hi, I am trying to run a Kubeflow pipeline.
Two steps run in parallel and dump data to two different folders on a PVC. A third component then collects the data from those two folders, merges it, and dumps the merged data to another folder on the PVC.
Here is my pipeline code:
from kfp import dsl, gcp

vop = dsl.VolumeOp(
    name='no2-pvc',
    resource_name="no2-pvc",
    size="100Gi",
    modes=dsl.VOLUME_MODE_RWO
)
##LOADING POSITIVE DATA##
load_positive_data = dsl.ContainerOp(
    name='load_positive_data',
    image=load_positive_data_image,
    command="python",
    arguments=[
        "/app/load_positive_data.py",
    ],
    pvolumes={"/mnt/positive/": vop.volume}
).apply(gcp.use_gcp_secret("user-gcp-sa"))
##LOADING NEGATIVE DATA##
load_negative_data = dsl.ContainerOp(
    name='load_negative_data',
    image=load_negative_data_image,
    command="python",
    arguments=[
        "/app/load_negative_data.py",
    ],
    pvolumes={"/mnt/negative/": vop.volume}
).apply(gcp.use_gcp_secret("user-gcp-sa"))
##MERGING POSITIVE AND NEGATIVE DATA##
marge_pos_neg_data = dsl.ContainerOp(
    name='marge_pos_neg_data',
    image=marged_data_image,
    command="python",
    arguments=[
        "/app/merge_neg_pos.py"
    ],
    pvolumes={"/mnt/positive/": load_negative_data.pvolume, "/mnt/negative/": load_positive_data.pvolume}
    #volumes={'/mnt': vop.after(load_negative_data, load_positive_data)}
).apply(gcp.use_gcp_secret("user-gcp-sa")).after(load_positive_data, load_negative_data)
##PROCESSING MERGED DATA##
process_marged_data = dsl.ContainerOp(
    name='process_data',
    image=perpare_merged_data_image,
    command="python",
    arguments=[
        "/app/prepare_all_dataset.py"
    ],
    pvolumes={"/mnt/pos_neg": marge_pos_neg_data.pvolume}
).apply(gcp.use_gcp_secret("user-gcp-sa")).after(marge_pos_neg_data)
The load-positive-data and load-negative-data steps work fine, but the marge-pos-neg-data step fails with the following error:
This step is in Error state with this message:
task 'no2-pipeline-x5kpd.marge-pos-neg-data'
errored: Pod "no2-pipeline-x5kpd-2954674781" is invalid:
spec.volumes[3].name: Duplicate value: "no2-pvc"
Hoping for your help to resolve the issue.
CodePudding user response:
pvolumes={"/mnt/positive/": vop.volume} and pvolumes={"/mnt/negative/": vop.volume} were creating two separate PVCs.
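The "Duplicate value" message can be reproduced without a cluster. What follows is only a minimal sketch in plain Python, not Kubeflow code: it assumes that every entry in pvolumes adds one volume entry to the pod spec, named after the underlying PVC, and that volume names in a pod must be unique. The helpers build_pod_volumes and duplicate_volume_names are hypothetical names made up for this illustration.

```python
from collections import Counter


def build_pod_volumes(pvolumes):
    """Hypothetical helper: one pod volume name per pvolumes entry.

    `pvolumes` maps a mount path to the name of the PVC-backed volume,
    mirroring the shape of the `pvolumes` argument in the question.
    """
    return list(pvolumes.values())


def duplicate_volume_names(volume_names):
    """Return the sorted volume names that occur more than once."""
    counts = Counter(volume_names)
    return sorted(name for name, n in counts.items() if n > 1)


# Merge step as written in the question: both mount paths resolve to
# the same underlying volume, "no2-pvc".
merge_as_written = {"/mnt/positive/": "no2-pvc", "/mnt/negative/": "no2-pvc"}

# Assumed alternative: mount the volume once at a parent path.
merge_single_mount = {"/mnt/": "no2-pvc"}

print(duplicate_volume_names(build_pod_volumes(merge_as_written)))    # ['no2-pvc']
print(duplicate_volume_names(build_pod_volumes(merge_single_mount)))  # []
```

Under these assumptions, one possible way around the error is to mount the volume only once per step, for example at /mnt, and let the scripts read and write subfolders beneath that path. Whether that layout fits depends on where the load scripts actually write their output.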