January 2016 • 2016MNRAS.455..626M
Abstract • We present the Signal Detection using Random-Forest Algorithm (SIDRA), a detection and classification algorithm based on the Random Forest machine-learning technique. The goal of this paper is to show the power of SIDRA for quick and accurate signal detection and classification. We first assess the method with simulated light curves and then apply it to a subset of the Kepler space mission catalogue. To analyse the power of the method, we use five classes of simulated light curves: CONSTANT (constant light curves), TRANSIT (transiting exoplanets), VARIABLE (variable light curves), MLENS (microlensing events) and EB (eclipsing binaries). The algorithm uses four features to classify the light curves. The training sample contains 5000 light curves (1000 from each class), and 50 000 random light curves are used for testing. The total SIDRA success ratio is ≥90 per cent. Furthermore, the success ratio reaches 95-100 per cent for the CONSTANT, VARIABLE, EB and MLENS classes and 92 per cent for the TRANSIT class with a decision probability of 60 per cent. Because the TRANSIT class is the one that fails most often, we run a simultaneous fit using SIDRA and a Box Least Squares (BLS)-based algorithm to search for transiting exoplanets. As a result, our algorithm detects 7.5 per cent more planets than a classic BLS algorithm, with better results for light curves with lower signal-to-noise ratios. SIDRA recovers 98 per cent of the planet candidates in the Kepler sample and fails for 7 per cent of the false-alarm subset. SIDRA promises to be useful for developing a detection algorithm and/or classifier for large photometric surveys such as the future TESS and PLATO exoplanet space missions.
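The two quantities quoted above, the decision probability and the per-class success ratio, can be illustrated with a minimal Python sketch. It is not the SIDRA implementation. It assumes that the decision probability is the fraction of trees in the forest voting for the winning class, and that a light curve is left unclassified when that fraction falls below the threshold. The function names `decide` and `success_ratio` and the `UNDECIDED` label are hypothetical.

```python
from collections import Counter

# Class names taken from the abstract.
CLASSES = ("CONSTANT", "TRANSIT", "VARIABLE", "MLENS", "EB")


def decide(tree_votes, threshold=0.6):
    """Return (label, probability) from per-tree votes.

    Assumption: the decision probability is the share of trees that
    voted for the most popular class. Below the threshold the light
    curve is reported as 'UNDECIDED' (a hypothetical label).
    """
    if not tree_votes:
        raise ValueError("tree_votes must not be empty")
    label, n_votes = Counter(tree_votes).most_common(1)[0]
    probability = n_votes / len(tree_votes)
    if probability >= threshold:
        return label, probability
    return "UNDECIDED", probability


def success_ratio(true_labels, predicted_labels):
    """Return the fraction of correctly labelled light curves per class."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: hits[cls] / totals[cls] for cls in totals}
```

For example, a forest of ten trees with seven TRANSIT votes passes a 60 per cent threshold, whereas a five-to-five split does not. In practice the vote fractions would come from a trained Random Forest rather than from hand-written lists.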