Many machine learning algorithms are based on the assumption that training examples are drawn independently and identically distributed (i.i.d.). However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share common objects and hence the features of those objects. We show that the classic approach of ignoring this problem can have a harmful effect on the accuracy of statistics, and then consider alternatives. One of these is to use only independent examples, discarding the remaining information. However, this is clearly suboptimal. We analyze sample variance and sample error bounds in a networked setting, providing both improved and new results. An important component of our approach is formed by efficient
sample weighting schemes, which have a beneficial effect on variance and concentration bound analysis. For power-law graphs, this can improve the sample complexity by either a
constant or a polynomial factor, depending on the situation.
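
As a rough illustration of the three options discussed above (ignoring the dependence, keeping only independent examples, and reweighting), the following sketch computes the exact variance of a weighted sample mean on a tiny networked sample. Everything in it is an assumption made for illustration and is not taken from the paper: each example is a hypothetical pair of objects `(a, b)`, its value is `X[a] + Y[b]` with all object values independent and of unit variance, and the weights in `reweighted` are simply hand-picked inverse-variance weights for this toy sample, not the weighting scheme analyzed in this work.

```python
from fractions import Fraction


def weighted_mean_variance(examples, weights):
    """Exact variance of sum_i w_i * Z_i for the toy model Z_i = X[a_i] + Y[b_i].

    Toy assumption: all X and Y values are independent with unit variance,
    so Cov(Z_i, Z_j) counts the objects that examples i and j share.
    """
    total = Fraction(0)
    for (a_i, b_i), w_i in zip(examples, weights):
        for (a_j, b_j), w_j in zip(examples, weights):
            cov = int(a_i == a_j) + int(b_i == b_j)
            total += w_i * w_j * cov
    return total


# Hypothetical networked sample: the first three examples share object a = 0.
examples = [(0, 0), (0, 1), (0, 2), (1, 3)]
n = len(examples)

# Option 1: ignore the dependence and average all examples uniformly.
uniform = [Fraction(1, n)] * n
var_uniform = weighted_mean_variance(examples, uniform)   # 7/8
naive_iid_variance = Fraction(2, n)                       # 1/2, the i.i.d. formula

# Option 2: keep only examples that share no object (here examples 0 and 3).
independent_only = [Fraction(1, 2), Fraction(0), Fraction(0), Fraction(1, 2)]
var_independent = weighted_mean_variance(examples, independent_only)  # 1

# Option 3: keep every example but down-weight the ones that overlap.
reweighted = [Fraction(1, 5)] * 3 + [Fraction(2, 5)]
var_reweighted = weighted_mean_variance(examples, reweighted)         # 4/5

print("naive i.i.d. variance:", naive_iid_variance)
print("uniform, true variance:", var_uniform)
print("independent subset:", var_independent)
print("reweighted:", var_reweighted)
```

In this toy sample the i.i.d. formula understates the true variance of the uniform mean (1/2 versus 7/8), discarding the dependent examples gives the largest variance (1), and the reweighted mean gives the smallest (4/5).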