We employ the k-nearest neighbor (KNN) algorithm for photometric redshift measurement of quasars from the Fifth
Data Release (DR5) of the Sloan Digital Sky Survey (SDSS). KNN is an instance-based learning algorithm in which
the result for a new query instance is predicted from the closest training samples. The regressor does not fit
any model and relies only on memory. Given a query quasar, we find the known quasars (training
points) closest to the query point; the redshift of the query is simply assigned to be the average of the values of its k
nearest neighbors. Three kinds of colors (PSF, Model or Fiber) and spectroscopic redshifts are used as input
parameters, separately. The combination of the three kinds of colors is also taken as input. The experimental
results indicate that the best input pattern is PSF + Model + Fiber colors in all experiments. With this pattern,
59.24%, 77.34% and 84.68% of photometric redshifts are obtained within Δz < 0.1, 0.2 and 0.3, respectively. If
only one kind of color is used as input, the Model colors achieve the best performance. However, when two
kinds of colors are used, the best result is achieved by PSF + Fiber colors. In addition, the nearest neighbor method (k = 1)
shows its superiority over KNN (k ≠ 1) for the given sample.
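The KNN estimate described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code used in the paper: the function names `knn_redshift` and `fraction_within`, the use of Euclidean distance in color space, and the toy arrays are all assumptions made for the example.

```python
import numpy as np

def knn_redshift(train_colors, train_z, query_colors, k=1):
    """Average the spectroscopic redshifts of the k nearest training quasars.

    Distances are Euclidean in color space (an assumption of this sketch).
    """
    train_colors = np.asarray(train_colors, dtype=float)
    train_z = np.asarray(train_z, dtype=float)
    query_colors = np.atleast_2d(np.asarray(query_colors, dtype=float))
    # Squared distances between every query and every training point,
    # shape (n_query, n_train).
    d2 = ((query_colors[:, None, :] - train_colors[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training points for each query.
    nearest = np.argsort(d2, axis=1)[:, :k]
    return train_z[nearest].mean(axis=1)

def fraction_within(z_phot, z_spec, tol):
    """Fraction of objects whose |z_phot - z_spec| is below tol."""
    diff = np.abs(np.asarray(z_phot, dtype=float) - np.asarray(z_spec, dtype=float))
    return float(np.mean(diff < tol))

# Toy training set: two colors per object and made-up redshifts.
toy_colors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
toy_z = [0.5, 1.0, 1.5, 3.0]
z_k1 = knn_redshift(toy_colors, toy_z, [[0.1, 0.1]], k=1)  # nearest neighbor only
z_k3 = knn_redshift(toy_colors, toy_z, [[0.1, 0.1]], k=3)  # mean of three neighbors
```

Calling `fraction_within(z_phot, z_spec, 0.1)` on a held-out sample gives the kind of percentage quoted above for Δz < 0.1; the same call with 0.2 and 0.3 gives the other two.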
We investigate two methods, kernel regression and the nearest neighbor algorithm, for photometric redshift estimation
with quasar samples from the SDSS (the Sloan Digital Sky Survey) and UKIDSS (the UKIRT Infrared Deep Sky
Survey) databases. Both kernel regression and the nearest neighbor algorithm belong to the family of instance-based
learning algorithms, which store all the training examples and "delay learning" until prediction time. The major
difference between the two algorithms is that kernel regression uses a weighted average of the spectroscopic redshifts of the
neighbors of a query point, while the nearest neighbor algorithm uses the spectroscopic redshift of the single nearest neighbor
of a query point. Each algorithm has its own advantages and disadvantages. Our experimental results show that
kernel regression gives more accurate predictions, while the nearest neighbor algorithm shows its superiority
especially for more thinly spread data, e.g. high redshift quasars.
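The difference between the two estimators can be illustrated with a short NumPy sketch. It assumes a Gaussian kernel with a single global bandwidth (a Nadaraya-Watson estimator); the abstract does not state which kernel or bandwidth was used, so `kernel_regression`, `nearest_neighbor` and the `bandwidth` parameter are hypothetical names chosen for this example.

```python
import numpy as np

def _sq_dists(query_x, train_x):
    """Squared Euclidean distances, shape (n_query, n_train)."""
    q = np.asarray(query_x, dtype=float)
    t = np.asarray(train_x, dtype=float)
    return ((q[:, None, :] - t[None, :, :]) ** 2).sum(axis=2)

def kernel_regression(train_x, train_z, query_x, bandwidth=0.5):
    """Weighted average of training redshifts with Gaussian weights."""
    train_z = np.asarray(train_z, dtype=float)
    d2 = _sq_dists(query_x, train_x)
    # Subtracting the per-query minimum leaves the weight ratios unchanged
    # and prevents every weight from underflowing to zero.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    w = np.exp(-0.5 * d2 / bandwidth**2)
    return (w @ train_z) / w.sum(axis=1)

def nearest_neighbor(train_x, train_z, query_x):
    """Redshift of the single closest training point."""
    train_z = np.asarray(train_z, dtype=float)
    return train_z[np.argmin(_sq_dists(query_x, train_x), axis=1)]

# Toy one-dimensional example with made-up values.
toy_x = [[0.0], [1.0]]
toy_z = [1.0, 3.0]
z_kernel = kernel_regression(toy_x, toy_z, [[0.5]])  # blends both neighbors
z_nn = nearest_neighbor(toy_x, toy_z, [[0.2]])       # copies the closest one
```

As the bandwidth shrinks, the kernel estimate approaches the nearest neighbor estimate, which is one way to see why the two methods behave differently where training points are sparse.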
With large-scale multicolor photometric and fiber-based spectroscopic projects being carried out, millions of uniform
samples are available to astronomers. In view of this, we have developed an automatic system to
estimate photometric redshifts for both galaxies and quasars. In this paper we give an exhaustive introduction
to the system. We first describe a series of methods integrated in this system, such as template fitting, the color-magnitude-redshift relation, polynomial regression, support vector machines and kernel regression. The merits
and demerits of these approaches are indicated, so that users can choose a suitable algorithm to
estimate photometric redshifts according to data characteristics and science requirements. Then, we present
a case study to illustrate how the system works. In order to build a more robust system that increases the
accuracy and speed of photometric redshift estimation, we pay special attention to algorithm choice and data
preparation. From the user's viewpoint, an easy-to-use interface will be provided. Finally, we point out
promising techniques for measuring photometric redshifts and the application prospects of this system. In the
future, the system will become an essential tool for automatically determining photometric redshifts in the study
of the large-scale structure of the Universe and the formation and evolution of galaxies.
The Sloan Digital Sky Survey (SDSS) is an ambitious photometric and spectroscopic project, providing huge and
abundant samples for photometric redshift estimation. We employ polynomial regression to estimate photometric
redshifts using 330,000 galaxies with known spectroscopic redshifts from the SDSS Release Four spectroscopic catalog,
and compare three polynomial regression methods, i.e. linear regression, quadratic regression and cubic regression,
with different samples. This technique gives absolute convergence in a finite number of steps, represents a better
fit with fewer coefficients and yields the result as a mathematical expression. For astronomers, this method is much easier to
use and understand than other empirical methods. Our results indicate that equal or better
accuracy is provided; moreover, the best r.m.s. dispersion of this approach is 0.0256. In addition, a
comparison of our results with other works is presented.
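A minimal sketch of linear, quadratic and cubic regression by ordinary least squares is given below. It is not the paper's implementation: the abstract does not specify the input features or whether cross terms were included, so the full polynomial expansion, the helper names (`design_matrix`, `fit_polynomial`, `predict`, `rms_dispersion`) and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations_with_replacement

def design_matrix(x, degree):
    """All monomials of the input columns up to `degree`, plus a constant.

    `x` must be 2-D with shape (n_objects, n_features).
    """
    x = np.asarray(x, dtype=float)
    cols = [np.ones(x.shape[0])]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(x.shape[1]), d):
            cols.append(np.prod(x[:, list(idx)], axis=1))
    return np.column_stack(cols)

def fit_polynomial(x, z, degree):
    """Least-squares coefficients; solved directly, with no iteration."""
    coef, *_ = np.linalg.lstsq(design_matrix(x, degree),
                               np.asarray(z, dtype=float), rcond=None)
    return coef

def predict(x, coef, degree):
    """Evaluate the fitted polynomial at new inputs."""
    return design_matrix(x, degree) @ coef

def rms_dispersion(z_phot, z_spec):
    """Root-mean-square difference between predicted and true redshifts."""
    diff = np.asarray(z_phot, dtype=float) - np.asarray(z_spec, dtype=float)
    return float(np.sqrt(np.mean(diff**2)))

# Synthetic stand-in data: two features and a made-up quadratic relation.
rng = np.random.default_rng(0)
toy_x = rng.uniform(-1.0, 1.0, size=(200, 2))
toy_z = 0.1 + 0.2 * toy_x[:, 0] - 0.3 * toy_x[:, 1] + 0.05 * toy_x[:, 0] * toy_x[:, 1]
rms_by_degree = {
    deg: rms_dispersion(predict(toy_x, fit_polynomial(toy_x, toy_z, deg), deg), toy_z)
    for deg in (1, 2, 3)
}
```

Because the fit reduces to one linear solve, the result is an explicit polynomial in the inputs, which matches the abstract's point that the method yields a mathematical expression. On real data the dispersion should be measured on a held-out sample rather than on the training set as done here.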
A new application framework for the virtual observatory (VO) is designed for discovering unknown knowledge from thousands of astronomical catalogs which have already been released and are accessible through VO services. The framework consists of two new technologies that seamlessly associate data queried from SkyNode-supported databases with data mining (DM) algorithms, which either come from third-party software or are developed directly on top of the framework. The first is a high-level programming language, called Job Description Language (JDL), for describing jobs for data access and numerical computation based on web services. The second is a computation component standard with both local and web service invocation interfaces, named CompuCell. It is a universal solution for integrating arbitrary third-party DM software into the framework so that it can be invoked directly in a JDL program. We implement a prototype with a JDL-supported portal and implement clustering algorithms as CompuCell components. We combine a series of data mining procedures with a data access procedure by programming in JDL on the portal. A scientific study, which recognizes OB associations from the 2MASS catalog, serves as a demonstration of the prototype. It confirms the feasibility of the application framework.
The ability to accurately measure redshifts from photometric data is of great importance
for studying cosmology, the large-scale structure of the Universe, the determination of fundamental astrophysical quantities
and so on, because photometric redshifts may provide approximate distances to an enormous set of
objects. At present, various algorithms for photometric redshifts have been investigated. This has induced us
to develop a software platform that integrates different algorithms for estimating photometric redshifts, such
as color-magnitude-redshift (CMR), Support Vector Machines (SVMs), HyperZ and Artificial Neural Networks
(ANNs). The requirements of the software platform and architectural issues are addressed, and its framework design
and implementation are discussed. It provides a user-friendly interface through which users can choose the method they
prefer, upload their own data, and obtain the results they need with a click of the mouse. This framework is flexible and
extensible enough to measure photometric redshifts.