I have to use the go-elasticsearch library to bulk-insert data coming from Pulsar, but I have a problem.
Pulsar delivers the data in batches of 1000 records. When I bulk-insert them into Elasticsearch, I sometimes get the error attached below, and it causes data loss. Thanks for any answer...
ERROR: circuit_breaking_exception: [parent] Data too large, data for [indices:data/write/bulk[s]] would be [524374312/500mb], which is larger than the limit of [510027366/486.3mb], real usage: [524323448/500mb], new bytes reserved: [50864/49.6kb], usages [request=0/0b, fielddata=160771183/153.3mb, in_flight_requests=50864/49.6kb, model_inference=0/0b, eql_sequence=0/0b, accounting=6898128/6.5mb]
This is the bulk-indexing code:
func InsertElastic(y []models.CP, ElasticStruct *config.ElasticStruct) {
	fmt.Println("------------------")
	bi, err := esutil.NewBulkIndexer(esutil.BulkIndexerConfig{
		Index:      enum.IndexName,
		Client:     ElasticStruct.Client,
		FlushBytes: 10e6, // flush a bulk request every ~10 MB
	})
	if err != nil {
		panic(err)
	}
	var i uint64 // documents successfully written to Elasticsearch
	var x uint64 // documents received from Pulsar
	start := time.Now().UTC()
	for _, doc := range y {
		data, err := json.Marshal(doc)
		if err != nil {
			panic(err)
		}
		err = bi.Add(
			context.Background(),
			esutil.BulkIndexerItem{
				Action: "index",
				Body:   bytes.NewReader(data),
				OnSuccess: func(ctx context.Context, item esutil.BulkIndexerItem, res esutil.BulkIndexerResponseItem) {
					// callbacks may run concurrently in the indexer's workers,
					// so increment atomically (needs "sync/atomic")
					atomic.AddUint64(&i, 1)
				},
				OnFailure: func(ctx context.Context, item esutil.BulkIndexerItem, res esutil.BulkIndexerResponseItem, err error) {
					if err != nil {
						log.Printf("ERROR: %s", err)
					} else {
						log.Printf("ERROR: %s: %s", res.Error.Type, res.Error.Reason)
					}
				},
			},
		)
		if err != nil {
			log.Fatalf("Unexpected error: %s", err)
		}
		x++
	}
	if err := bi.Close(context.Background()); err != nil {
		log.Fatalf("Unexpected error: %s", err)
	}
	dur := time.Since(start)
	fmt.Println(dur)
	fmt.Println("Success writing data to elastic : ", i)
	fmt.Println("Success incoming data from pulsar : ", x)
	fmt.Println("Difference : ", x-i)
	fmt.Println("Now : ", time.Now().UTC().String())
	if i < x {
		fmt.Println("FATAL")
	}
	fmt.Println("------------------")
}
CodePudding user response:
TL;DR
It seems like you do not have enough JVM heap on your node: the limit in the error, 486.3mb, is 95% of the heap (the default parent circuit breaker limit), which suggests the node is running with only ~512MB of heap.
You are hitting a circuit breaker that protects Elasticsearch from going Out Of Memory (OOM): the node rejects the bulk request rather than risk crashing.
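To confirm, you can inspect the breaker stats (equivalent to GET _nodes/stats/breaker). A minimal sketch in Go, assuming an existing *elasticsearch.Client named es:

res, err := es.Nodes.Stats(es.Nodes.Stats.WithMetric("breaker"))
if err != nil {
	log.Fatalf("breaker stats: %s", err)
}
defer res.Body.Close()
fmt.Println(res.String()) // shows each breaker's limit and current estimated usage

The parent breaker's limit_size in that output should match the 486.3mb from your error.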
Solution(s)
- Increase the JVM heap; see the Elasticsearch documentation on sizing your nodes.
- Send smaller bulk requests, and retry the ones that get rejected (see the sketch below).
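For the second option, here is a minimal sketch, not your exact setup: the import path assumes the v7 client, models and config are your own packages, and chunkSize and indexChunks are illustrative names I made up. The idea is twofold: configure the client to retry on HTTP 429 (the status a tripped circuit breaker returns), so rejected documents are retried instead of lost, and feed InsertElastic smaller slices (you can also lower FlushBytes, e.g. to 1-2 MB) so no single burst of bulk requests pushes the heap over the breaker limit.

import (
	"time"

	"github.com/elastic/go-elasticsearch/v7" // adjust to the client version you use
)

// newRetryingClient builds a client that retries requests rejected with
// HTTP 429 (circuit breaker) using a simple linear backoff, instead of
// failing the bulk item immediately. The BulkIndexer reuses this client's
// transport, so the retry policy applies to its bulk requests too.
func newRetryingClient() (*elasticsearch.Client, error) {
	return elasticsearch.NewClient(elasticsearch.Config{
		RetryOnStatus: []int{502, 503, 504, 429}, // default list plus 429
		RetryBackoff: func(attempt int) time.Duration {
			return time.Duration(attempt) * 500 * time.Millisecond
		},
		MaxRetries: 5,
	})
}

// chunkSize is illustrative: pick it so chunkSize * average document size
// stays well below the breaker limit, then measure.
const chunkSize = 250

// indexChunks feeds the existing InsertElastic smaller slices so a single
// 1000-record Pulsar batch never lands as one oversized burst.
func indexChunks(y []models.CP, es *config.ElasticStruct) {
	for start := 0; start < len(y); start += chunkSize {
		end := start + chunkSize
		if end > len(y) {
			end = len(y)
		}
		InsertElastic(y[start:end], es)
	}
}

With retries on 429 the client backs off and resends instead of dropping documents, and smaller chunks keep each request comfortably under the limit; even so, raising the ~512MB heap is the more durable fix.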