Global Refinement of Positions and Uncertainties ------------------------------------------------ (H.L. McCallon 981012) The following outline describes a plan to refine 2MASS positions and position uncertainties taking advantage of global overlap information as contained in the database. The position corrections are obtained by "Martinizing" the position data. The name comes from the fact that Martin Weinberg originally proposed this type of approach a number of years ago. Prior to that step, however, the original position uncertainties must be adjusted to compensate for the fact that frame position uncertainty contribution is currently set to a constant in the pipeline processing. The described approach to achieve the needed uncertainty adjustments resulted from a number of discussions between myself and John Fowler and has elements in common with the position "Martinizing". Before getting into the position refinement itself, a method for extracting the needed data from the database and setting it up for efficient access is described. This is quite important given the huge volume of data involved. The technique steps through the database a tile at a time, computing what it needs and storing the results in memory. After all the needed data is accumulated, the refinement is done using arrays in memory. The position uncertainties of point sources coming out of the pipeline will be adjusted to make them reflect actual discrepancy dispersions by multiplying their error variances by a factor denoted "variance factor". These variance factors will be determined for a dozen Dec segments in each scan. Positions at the scan segment level will be adjusted by a global minimization of scan overlap discrepancies weighted against ACT reference star match information. This refinement technique could be applied at any point in the survey, up to and including full sky coverage. The greater the contiguous coverage, the better the results. A test case, using prototype code and data from only a single night (971116n), showed a significant improvement. The proposed steps are as follows: I. Step through tiles and, for the primary scan covering each, compute difference information w.r.t. ACT's and overlapping scans. A. Pull positions of the four corners for the primary scan from the database. B. Pull all sources from any scan within the database which have high quality extractions and are enclosed by the primary-scan four corner points. C. Compute average position variances (RA and Dec) for primary scan and load into the arrays VarRA and VarDec indexed by tile. In most instances each will be the average of a group of constants. This occurs because: 1) Only "high quality extractions" (I-B) are used. 2) PROPHOT applies a floor to the position uncertainties which generally clips the high quality extractions at a minimum allowed value. 3) The contribution of position reconstruction of the frames to the original variance is constant. This, of course, is what gives rise to the need for the variance factors. D. Match primary-scan sources to overlapping-scan sources and to ACT's. E. Divide matches from step I-D into 12 segments by Dec. These will henceforth be referred to as "tile segments" and will be the basic unit of position adjustment. F. For each tile-segment overlap check existing entries in the idtsl and idtsh arrays to see if the overlap combination has already been processed. Note that each overlap combination will show up twice but need only be processed once. Assuming the combination has not already been processed, compute the following variables, increment the overlap index "iol" and load into the indicated arrays (otherwise skip to G): 1) Number of matches => nMCHo(iol) 2) Sums of match differences in RA => SdRAo(iol) 3) Sums of match differences in Dec => SdDECo(iol) 4) Sums of match difference squares in RA => SdRA2o(iol) 5) Sums of match difference squares in Dec => SdDC2o(iol) In order to provide easy access to the overlap data two additional arrays with the same indexing are maintained. These are 6) Unique ID of lower tile segment in overlap => idtsl(iol) 7) Unique ID of higher tile segment in overlap => idtsh(iol) The terms "lower" and "higher" here simply refer to the tile-segment ID's themselves. The sign of the difference sums (SdRAo and SdDECo) should always reflect the position of the tile-segment with the larger ID minus that with the smaller ID. G. For each tile-segment identify the ACT matches, compute the following variables and load to tile-segment indexed (its) arrays as indicated: 1) Number of ACT matches => nMCHa(its) 2) Sums of ACT match differences in RA => SdRAa(its) 3) Sums of ACT match differences in DEC => SdDECa(its) II. Prepare difference data for efficient access. A. Generate two sets of keys allowing indexed access to the overlap arrays (I-F) in: 1) ascending idtsl order => array "idxtsl" 2) ascending idtsh order => array "idxtsh" B. Step through the idtsl overlap array in ascending order using the idxtsl keys and build a tile-segment indexed array "tsl1st" giving the location of the first occurrence of that tile-segment in idts1 order. Repeat to build a tile-segment indexed array "tsh1st" for first occurrence in idtsh order. Note: One can now access tile-segment unique data directly via tile-segment index. The shared overlap data can be obtained by first using the tsl1st key to jump into the idxtsl array at the first occurrence of that tile-segment then pulling in all data until idtsl changes. The remaining overlap data can be obtained in the same manner using the tsh1st key to jump into idxtsh array at the first occurrence of that tile-segment then pulling in all data until idtsh changes. III. Compute variance factors for each tile segment. The RA and Dec factors can be computed independently. The steps described are applied to RA and then Dec to generate separate sets of variance factors. Since the purpose of the variance factors is to allow the contribution of the frame position uncertainties to vary, only that portion of the original variance should be multiplied by the variance factor. This can be achieved by representing the adjusted variance as follows: var_adj = var_orig + V0*(vfac-1) where: var_adj = adjusted variance var_orig = original variance V0 = constant frame variance used in pipeline vfac = variance factor Note: A variance factor of one indicates no adjustment. A variance factor of zero would reduce the contribution of the frame uncertainty to zero. A. The summation of chi-squares over n in each overlap can be written as: sum_chisq/n = (sum_dx^2)/{N*[Var1+V0*(vf1-1) +Var2+V0*(vf2-1)]} where: sum_dx^2 = sum of overlap differences squared N = number of matched points in overlap vf1 = variance factor of first tile segment vf2 = variance factor of 2nd tile segment Var1 = average original variance from first tile (I-C) Var2 = average original variance from 2nd tile (I-C) V0 = constant frame variance used in pipeline B. Form sum "S" of differences which, if driven to zero, would drive the sum_chisq/n values to one. It will not, of course, be possible to drive "S" to zero, but it can be minimized. Each term in "S" is associated with an overlap and is weighted by the number of matches in that overlap. S =SUM{Nij*[(sum_dx^2)ij -Nij*(Vari+V0*(vfi-1) +Varj+V0*(vfj-1))]^2} where i ranges over the tile segments and j over the overlaps C. Form a set of simultaneous linear equations by taking the partial derivative of S w.r.t. each of the tile-segment variance factors in turn and setting each partial equal to zero. D. Solve simultaneous linear equations above for the desired variance factors which minimize the sum_chisq/n differences from one. This step is complicated by the fact that for the entire sky (limiting the northern survey to > -12 deg and the southern to < +18 deg) there are 893364 tile-segments. Although the matrix to be inverted is extremely large (893364x893364) it is also extremely sparse with typically only about 2-3 non-zero elements per row. The number of non-zero elements goes up near the poles but still remains very small. 1. The matrix will have to be stored and solved using sparse matrix techniques. IMSL has some routines which look promising assuming they can handle a problem this large. Need to check with Martin Weinberg on the upper limits of the sparse matrix approach. 2. Should the matrix prove too large even for sparse matrix techniques, an alternate approach might be to solve the problem as described but with manageable subsets of the entire sky. Each subset could be made up of a number of adjacent Dec bands. This should leave things close enough to finish off with iteration. E. Iteration could proceed as follows: 1. Determine what variance factor for the primary tile segment would make each of the overlap chisq/n values go to one. Let vfvij represent the vote of overlap j on primary variance factor i; then: vfvij = [(sum_dx^2)ij/Nij -Vari -Varj]/V0 -vfj_lst +2 where vfj_lst is the variance factor for tile-segment j from the last iteration 2. Compute a weighted average of the vfvij overlap votes to obtain what a full-step value (vfi_fs) for the new iteration of vfi would be. vfi_fs = (sum Nij*vfvij)/(sum Nij) where the sums are over the j overlaps. 3. Take a portion of the indicated full step to get the new value for vfi: vfi_new = vfi_lst +alf*(vfi_fs -vfi_lst) where alf is the factor which specifies the fraction of the full step to use. It usually helps to start alf out low and let it approach one as the iteration converges. F. Clip variance factors to prevent using unreasonably low variances as weights in the position correction (step IV). Define VarMin to be the minimum reasonable frame variance. If vfi*V0adjusted positions and associated uncertainties 2 => adjusted uncertainties for non-adjusted positions nTile = tile number RA = uncorrected RA DEC = uncorrected DEC SigRA = original RA sigma SigDEC= original Dec sigma Outputs: (for nflag=1) RAc = corrected RA DECc = corrected Dec SigRAa= adjusted sigma for corrected RA SgDECa= adjusted sigma for corrected Dec (for nflag=2) RAc = uncorrected RA DECc = uncorrected Dec SigRAa= adjusted sigma for uncorrected RA SgDECa= adjusted sigma for uncorrected Dec