I mostly use Scala for my data engineering work, and I'm finding that it lacks good equivalents of many libraries that are available in Python. Plotting libraries are one example: there seem to be none in Scala. The ones I find on GitHub, such as matplotlib4j, seem to be outdated or no longer maintained.
I realized that we can use JavaScript libraries instead. I came across D3 and would like to try it out. I have this example below:
%angular
<div>
  <svg class="chart"></svg>
</div>
<script>
  function useD3() {
    var data = [4, 8, 15, 16, 23, 42];
    var width = 420,
        barHeight = 20;
    // Linear scale mapping data values to pixel widths (D3 v3 API)
    var x = d3.scale.linear()
        .domain([0, d3.max(data)])
        .range([0, width]);
    var chart = d3.select(".chart")
        .attr("width", width)
        .attr("height", barHeight * data.length);
    // One <g> per datum, stacked vertically
    var bar = chart.selectAll("g")
        .data(data)
      .enter().append("g")
        .attr("transform", function(d, i) { return "translate(0," + i * barHeight + ")"; });
    bar.append("rect")
        .attr("width", x)
        .attr("height", barHeight - 1);
  }
  // Load D3 from the CDN only if it is not already on the page
  if (window.d3) {
    useD3();
  } else {
    var sc = document.createElement('script');
    sc.type = 'text/javascript';
    sc.src = 'https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.16/d3.min.js';
    sc.onload = useD3;
    sc.onerror = function(err) { alert(err); };
    document.getElementsByTagName('head')[0].appendChild(sc);
  }
</script>
It renders as expected, but my question is: how can I get the data from my Spark Scala code into this JavaScript? For example, I have the following Scala tuples printed out from one of my DataFrame computations:
Threshold = 0.0, Features = 48
Threshold = 0.05, Features = 36
Threshold = 0.1, Features = 35
Threshold = 0.15, Features = 34
Threshold = 0.2, Features = 34
Threshold = 0.25, Features = 34
Threshold = 0.3, Features = 34
Threshold = 0.35, Features = 34
Threshold = 0.4, Features = 34
Threshold = 0.45, Features = 32
Threshold = 0.5, Features = 30
I would like to plot this with Threshold on the x-axis and Features on the y-axis. How could I do this?
CodePudding user response:
D3 is quite a low-level library that lets you build up very complex interactive visualisations by mapping data to visual variables (usually SVG element attributes).
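To illustrate what such a data-to-attribute mapping amounts to, here is a minimal sketch in plain JavaScript with no D3 dependency. The helper names `linearScale` and `barAttributes` are made up for this illustration; the data and sizes are the ones from the question's bar chart.

```javascript
// Hypothetical helper (not part of D3): returns a function that maps a
// value in `domain` linearly onto `range`.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Hypothetical helper: computes, for each datum, the SVG attributes that
// the question's D3 code sets on the <g> and <rect> elements.
function barAttributes(data, width, barHeight) {
  const x = linearScale([0, Math.max(...data)], [0, width]);
  return data.map((d, i) => ({
    transform: "translate(0," + i * barHeight + ")", // vertical position of the bar
    width: x(d),                                      // bar length from the scale
    height: barHeight - 1,                            // 1px gap between bars
  }));
}

const bars = barAttributes([4, 8, 15, 16, 23, 42], 420, 20);
console.log(bars);
```

The largest value (42) maps to the full width of 420 pixels, and every other bar is scaled proportionally.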
However, if you just want 'standard' charts such as scatterplots or bar charts in JavaScript, I would suggest using a D3-based charting library such as britecharts or billboard.js. These provide such charts through a much shorter and more convenient syntax, so you avoid 're-inventing the wheel' by building them in D3 yourself, where you have to plot the data, calculate the ranges/scales and set up the axes (a surprising amount of work).
https://britecharts.github.io/britecharts/tutorial-scatter-plot.html
https://naver.github.io/billboard.js/
Others are available, but these two are actively maintained.
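Whichever library you pick, the data first has to reach JavaScript in a usable shape. As a rough sketch, assuming the printed lines from the question are available to the page as a string, they could be parsed as below. The helper names `parseRows` and `extent` are made up for this example, and the column-style layout is only one common shape; check your chosen library's documentation for the exact format it expects.

```javascript
// Sample output copied from the question.
const printed = [
  "Threshold = 0.0, Features = 48",
  "Threshold = 0.05, Features = 36",
  "Threshold = 0.1, Features = 35",
  "Threshold = 0.15, Features = 34",
  "Threshold = 0.2, Features = 34",
  "Threshold = 0.25, Features = 34",
  "Threshold = 0.3, Features = 34",
  "Threshold = 0.35, Features = 34",
  "Threshold = 0.4, Features = 34",
  "Threshold = 0.45, Features = 32",
  "Threshold = 0.5, Features = 30",
].join("\n");

// Hypothetical helper: turns each matching line into { threshold, features }.
function parseRows(text) {
  const pattern = /Threshold\s*=\s*([\d.]+),\s*Features\s*=\s*(\d+)/;
  return text
    .split("\n")
    .map((line) => pattern.exec(line))
    .filter((match) => match !== null) // skip lines that do not match
    .map((match) => ({
      threshold: parseFloat(match[1]),
      features: parseInt(match[2], 10),
    }));
}

// Hypothetical helper: [min, max] of a list of numbers, i.e. a scale's domain.
function extent(values) {
  return [Math.min(...values), Math.max(...values)];
}

const rows = parseRows(printed);

// Column-style layout: a label followed by that series' values.
const columns = [
  ["Threshold", ...rows.map((r) => r.threshold)],
  ["Features", ...rows.map((r) => r.features)],
];

const xExtent = extent(rows.map((r) => r.threshold)); // domain for the x-axis
const yExtent = extent(rows.map((r) => r.features));  // domain for the y-axis
console.log(columns, xExtent, yExtent);
```

Here `xExtent` and `yExtent` are the domains you would otherwise have to compute by hand when setting up D3 scales; a charting library typically derives them for you.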