python - N-dimensional histogram containing the maximum value of the weights that fall in each bin -


I have a set of m points in n dimensions, each of which has an associated "weight" value (basically, an array of m floats). Using numpy's histogramdd() I can generate the set's n-dimensional histogram.

If I use the weights parameter in histogramdd(), I get back:

the sum of the weights belonging to the samples falling into each bin.

The code below shows how to create these arrays:

import numpy as np

# m points in n dimensions.
n_dim, m = 3, 1000
points = np.random.uniform(0., 1., size=(m, n_dim))
# Weight of each point.
weights = np.random.uniform(0., 1., m)

# n-dimensional histogram.
histo = np.histogramdd(points)[0]
# Histogram containing the sum of the weights in each bin.
weights_histo = np.histogramdd(points, weights=weights)[0]

Instead of this, I need to create an n-dimensional histogram of the points where the value stored in each bin is the maximum weight value out of the weights associated with the points that fall within that bin.

I.e.: I need only the maximum weight stored in each bin, not the sum of the weights.

How can I do this?

There are several binned_statistic functions in scipy.stats. 'max' is one of the predefined statistics, and you can pass a callable as well.

import numpy as np
from scipy.stats import binned_statistic_dd

# m points in n dimensions.
n_dim, m = 3, 1000
points = np.random.uniform(0., 1., size=(m, n_dim))
# Weight of each point.
weights = np.random.uniform(0., 1., m)

weights_histo, bin_edges, bin_indices = binned_statistic_dd(
    points, weights, statistic=np.max, bins=5)

print(weights_histo.shape)  # (5, 5, 5)
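If you want to convince yourself that this does what the question asks, you can cross-check the result against a hand-rolled per-bin maximum built with np.digitize and the bin edges that binned_statistic_dd returns. A minimal sketch (the seeded generator and the fixed 5-bin grid are assumptions for reproducibility; empty bins come back as NaN, which the manual version mirrors):

```python
import numpy as np
from scipy.stats import binned_statistic_dd

rng = np.random.default_rng(0)  # seeded for reproducibility (assumption)
n_dim, m = 3, 1000
points = rng.uniform(0., 1., size=(m, n_dim))
weights = rng.uniform(0., 1., m)

res = binned_statistic_dd(points, weights, statistic='max', bins=5)
weights_histo = res.statistic

# Cross-check: recompute the per-bin maximum by hand. Digitizing against the
# inner bin edges gives indices 0..4 along each dimension, matching scipy's
# left-inclusive binning.
idx = np.stack([np.digitize(points[:, d], res.bin_edges[d][1:-1])
                for d in range(n_dim)], axis=1)
manual = np.full((5, 5, 5), np.nan)  # empty bins stay NaN, like scipy's result
for i, w in zip(idx, weights):
    t = tuple(i)
    manual[t] = w if np.isnan(manual[t]) else max(manual[t], w)
```

Note the string statistic 'max' and the callable np.max are interchangeable here; the string form lets scipy skip empty bins (leaving NaN) without calling your function on empty slices.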
