Impact of surveys, impact on peer-reviewed science

cchandle
Posts: 3
Joined: Fri Mar 28, 2014 1:16 pm

Postby cchandle » Fri Mar 28, 2014 1:31 pm

We have examined the papers using the data products from NRAO surveys, and looked at the number of unique authors of those papers who appear in the NRAO User Database versus those who do not. We consider this a measure of the increased impact of high level data products on the non-traditional radio astronomy community. The numbers are as follows:

Total number of users in NRAO database: ~6200
Total unique authors using survey data products: ~3800
Total authors in database: ~1800
Total authors not in database: ~2000

(Note that not all individuals in the user database are on successful proposals to use NRAO instruments, so some of those I might class as "non-traditional" as well, but let's not worry about that for now.)

Some conclusions:

- Survey data products effectively increase our user community by ~2000
- Approximately 45% of "all" users use survey data products
- Approximately 30% of "traditional" users use survey data products
- "Non-traditional" users comprise ~25% of all users
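For anyone who wants to check the arithmetic, here is a minimal sketch of how the percentages above follow from the four rounded counts. It assumes that "all" users means the database users plus the survey authors who are not in the database; the variable names are only illustrative.

```python
# Approximate counts quoted above (all rounded).
db_users = 6200            # users in the NRAO database
survey_authors = 3800      # unique authors using survey data products
authors_in_db = 1800       # survey authors who are in the database
authors_not_in_db = 2000   # survey authors who are not in the database

# Assumption: "all" users = database users + survey authors not in the database.
all_users = db_users + authors_not_in_db

frac_all_using_surveys = survey_authors / all_users
frac_traditional_using_surveys = authors_in_db / db_users
frac_non_traditional = authors_not_in_db / all_users

print(f'"all" users using survey products:         {frac_all_using_surveys:.0%}')
print(f'"traditional" users using survey products: {frac_traditional_using_surveys:.0%}')
print(f'"non-traditional" share of all users:      {frac_non_traditional:.0%}')
```

Because the inputs are rounded, the results agree with the quoted ~45%, ~30%, and ~25% only to within a couple of percentage points.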

Note that doing a similar exercise for the authors of VLASS White Papers (178 unique authors in total), we find that 38 (~20%) of the authors are not in our user database.

I will be using these numbers to guide how much impact any VLA Sky Survey should be allowed to have on peer-reviewed science, in the sense that I would not like to see a sky survey take more than ~25% of the total available science hours in a given year (i.e., no more than ~1500 hours per year). Thus, in my view, a multi-epoch approach covering several years will minimize the impact on peer-reviewed science while also addressing transient science goals, thereby maximizing the scientific impact for the entire community.

Gordon Richards has asked whether running in a survey mode can increase the total available science hours in a year, since it might be more efficient than standard operations. My response is: "By a little, maybe. We will continue to need time for maintenance and software work, and probably would also need to remain flexible about scheduling DDTs. We might be able to achieve 80% observing efficiency (i.e., fraction of all hours for science) during a survey, if we eliminated all other commissioning activities (=> no new capabilities or observing modes to be offered for future semesters, while the survey was being done)."

Claire Chandler
Deputy Assistant Director for Science, NM Operations, and VLASS Project Director

rlw@stsci.edu
Posts: 13
Joined: Mon Mar 03, 2014 6:42 pm

Re: Impact of surveys, impact on peer-reviewed science

Postby rlw@stsci.edu » Sat Mar 29, 2014 4:52 pm

Hi Claire,
Very interesting & informative post. But I think you have underestimated the usage of the survey data products. Are you only counting papers that are in the NRAO database?

I did an independent analysis for just NVSS and FIRST. I used ADS to extract a list of the authors on all papers that cite either the NVSS survey paper (Condon et al. 1998) or one of the basic FIRST papers (Becker et al. 1995, White et al. 1997). I believe the vast majority of these papers actually used the survey data, since there were not many science results in those papers. (So e.g. there should be few citations along the lines of "White et al. (1997) showed that....") The only common reason to cite either of these papers is to report a result using the survey data products (images or catalogs).

There are a total of 3522 papers in ADS that cite NVSS and/or FIRST. Those papers have a total of 9086 unique authors. That is much larger than the number of authors you quote (3800). There could be a little double-counting here, since some authors publish using more than one version of their name/initials; but even if I compare only last names, there are still 6925 unique authors among these papers. (And that's certainly an underestimate for people with common last names like "Smith" or "White".)
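The two ways of counting described above (full name with initials versus last name only) can be sketched as follows. This is only an illustration, not the script actually used: it assumes author strings in the "Last, F. M." form, and the toy input is made up rather than taken from ADS.

```python
def last_name(author: str) -> str:
    """Return the lower-cased part before the first comma of a 'Last, F. M.' name."""
    return author.split(",")[0].strip().lower()


def count_unique_authors(papers):
    """Count unique authors by full name string and by last name only.

    `papers` is an iterable of author lists, one list per paper.
    """
    full_names = set()
    last_names = set()
    for authors in papers:
        for author in authors:
            # Normalize case and whitespace only; different initials stay distinct.
            full_names.add(" ".join(author.lower().split()))
            last_names.add(last_name(author))
    return len(full_names), len(last_names)


# Hypothetical toy input, not real ADS output.
papers = [
    ["White, R. L.", "Becker, R. H."],
    ["White, Richard L.", "Smith, A."],
    ["Smith, B.", "Becker, R. H."],
]
n_full, n_last = count_unique_authors(papers)
print(n_full, n_last)
```

On this toy input the full-name count is 5 and the last-name count is 3: the two spellings of one author are double-counted by full name, while two different people named Smith are merged by last name. That is why the first number is an upper bound and the second a lower bound.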

In fact, it is likely that even 9000 is an underestimate, because we know there is a substantial number of papers that simply say they used data from the FIRST survey or NVSS without actually adding a reference to the survey paper. It is quite easy to use the catalogs or images without ever referring to the papers. Of the 654 papers in ADS that mention "NVSS" in the abstract and were published after 1998, only 300 actually cite the Condon et al. (1998) paper. So if anything these citation numbers underestimate the usage of the survey data.

I'm not sure how to do a definitive comparison of the usage and impact of the surveys compared with an equivalent time spent on smaller user proposals. Perhaps one approach that would be worth trying would be to examine the publication statistics for all papers associated with VLA observations that were taken between 1995 and 2002 (the approximate time span of FIRST and NVSS). Those 8 years should include about 70,000 hours of observing (a figure that needs to be reduced for engineering time). Over that time FIRST+NVSS observed a total of about 7,000 hours (that's a guess, I don't know the exact number), or about 10% of the time. It should be possible to use the number of publications, number of unique first authors, total number of unique authors, etc., over that period to compare the impact of the time spent on surveys with that of the 9x greater time spent on other proposals.
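The time budget in the paragraph above can be reproduced with a few lines. This is a rough sketch: it uses wall-clock hours with no engineering time subtracted, and the 7,000 survey hours are the guess quoted above, not a measured number.

```python
HOURS_PER_YEAR = 365.25 * 24     # wall-clock hours in an average year
years = 8                        # 1995 through 2002, inclusive

total_hours = years * HOURS_PER_YEAR   # before subtracting engineering time
survey_hours = 7000                    # guessed FIRST+NVSS total, as quoted above

survey_fraction = survey_hours / total_hours
other_to_survey = (total_hours - survey_hours) / survey_hours

print(f"total hours:            {total_hours:.0f}")
print(f"survey fraction:        {survey_fraction:.1%}")
print(f"non-survey/survey time: {other_to_survey:.1f}x")
```

Subtracting engineering time would shrink the total and so raise the survey fraction somewhat above 10%.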

I could also argue that the comparison ought to be made to just the bottom 10% of the executed proposals, since presumably the weakest proposals are the ones that get bumped off the schedule (or delayed) due to competition with the surveys. That does presume that the TAC does a perfect job of ranking the proposals, however, which I'm sure we all occasionally have doubts about. :-) So I think it would be fair to compare the survey impact to the average proposal impact.

Below are some more statistics from my analysis of the papers that cite NVSS or FIRST, in case anyone is interested.

Analysis of 3522 papers citing FIRST and/or NVSS (as of 2014 March 29)
9086 unique authors
1876 unique first authors
6925 unique authors (using last names only)
1666 unique first authors (using last names only)

Here is a list of the top 20 authors in various categories: authors (any position in author list), first authors, and the same but using only the last name (not the initials) to identify authors.

   Authors                   1st Author                Author (last name only)   1st Author (last name only)
1  (139) Schneider D. P.     (17) Verkhodanov O. V.    (148) Schneider           (17) Verkhodanov         
2  (99) Richards G. T.       (15) Machalski J.         (136) White               (16) Richards           
3  (98) Strauss M. A.        (14) Brotherton M. S.     (128) Richards            (16) Miller             
4  (84) Becker R. H.         (14) De Breuck C.         (120) Smith               (16) Taylor             
5  (73) White R. L.          (14) Jamrozy M.           (104) Becker              (15) Jackson             
6  (70) Brandt W. N.         (13) Planck Collaboration (101) Anderson            (15) Machalski           
7  (70) Fan X.               (13) Frey S.              (98) Strauss              (14) Liu                 
8  (67) Anderson S. F.       (13) Condon J. J.         (95) Taylor               (14) Wang               
9  (67) Hall P. B.           (13) Paredes J. M.        (74) York                 (14) De Breuck           
10 (65) York D. G.           (13) Sadler E. M.         (73) Brandt               (14) Jamrozy             
11 (61) Brinkmann J.         (12) Best P. N.           (72) Scott                (14) Brotherton         
12 (60) Ivezić Ž.            (12) Govoni F.            (72) Fan                  (13) Planck Collaboration
13 (50) Sadler E. M.         (12) Mickaelian A. M.     (71) Brinkmann            (13) Sadler             
14 (48) Gunn J. E.           (12) Richards G. T.       (70) Röttgering           (13) Frey               
15 (47) Röttgering H. J. A.  (12) Stern D.             (69) Hall                 (13) Cohen               
16 (47) Giovannini G.        (11) Caccianiga A.        (69) Jackson              (13) Paredes             
17 (47) Stern D.             (11) Miller N. A.         (67) Wang                 (13) Condon             
18 (46) Jarvis M. J.         (11) Masetti N.           (62) Jones                (13) Mickaelian         
19 (46) Lupton R. H.         (10) Jarvis M. J.         (62) Davies               (13) Stern               
20 (45) Taylor G. B.         (10) Magliocchetti M.     (62) Myers                (12) Best               

jlazio
Posts: 8
Joined: Thu Mar 20, 2014 11:13 pm

Re: Impact of surveys, impact on peer-reviewed science

Postby jlazio » Wed Apr 02, 2014 1:24 am

Another approach, fraught with the usual uncertainties. I asked ADS to return peer-reviewed papers published in 1995 or later, sorted by citation count. Of the top 100 papers:

#22. NVSS paper

#94. first FIRST paper (1995)


Other than WMAP papers, the only other "radio astronomy" papers to crack the top 100 are

#38. Urry, C. Megan; Padovani, Paolo "Unified Schemes for Radio-Loud Active Galactic Nuclei" 1995PASP..107..803U

#93. Kalberla, P. M. W.; Burton, W. B.; Hartmann, Dap; Arnal, E. M.; Bajaja, E.; Morras, R.; Pöppel, W. G. L. "The Leiden/Argentine/Bonn (LAB) Survey of Galactic HI. Final data release of the combined LDS and IAR surveys with improved stray-radiation corrections" 2005A&A...440..775K

elisabethmills
Posts: 13
Joined: Tue Mar 11, 2014 3:58 pm

Re: Impact of surveys, impact on peer-reviewed science

Postby elisabethmills » Wed Apr 02, 2014 3:48 pm

Two thoughts, possibly useful or possibly not.

First, on using citations to determine impact:

If we just care about the time spent on the VLA being 'useful' (how many people use data taken by the VLA), then the above are good metrics (the maximum number of citations, or a large number of unique users in papers citing NVSS+FIRST), and surveys win easily.

But if the goal is to quantify and compare the impact of the time spent with the VLA, then the comparison should perhaps better reflect how 'influential' both survey and non-survey science on the VLA are (how valuable the results obtained from these data are). As Joe points out, there will always be caveats to any attempt to quantify this, but a starting metric could be, as Rick suggests, to compare the total citations to the NVSS+FIRST papers and to other papers using the VLA. Or, perhaps, to tweak Joe's search, the maximum number of citations for papers citing/mentioning NVSS (rather than the survey paper itself) should be compared to other VLA papers?

Second, on comparative legacies:

We are using as justification two surveys that have largely reigned unique and supreme for almost 2 decades.
If we take their impact as an indication of future community use of the VLASS, then I think it should be considered just how unique the VLASS will be on similar timescales. As I believe the consensus right now is to use S band with perhaps 2'' resolution, it should largely be unique even in the face of deeper, lower-resolution, and lower-frequency upcoming surveys by SKA pathfinders and Westerbork. But it can be argued that, given that these more sensitive, or (potentially, if ALL-SKY does not happen) larger-area, surveys will be available at similar frequencies, there will be some fraction of the proposed survey science for which these other surveys will be better than the VLASS.

If one uses metrics that assume the VLASS will necessarily have the lasting community impact of NVSS or FIRST, a very strong case should be made for its absolute uniqueness (how much of the science being proposed can only be done with the VLASS) as well as complementarity of the VLASS with these other surveys (how much of the proposed science can use e.g. EMU/WODAN+VLASS for added value) which will define the legacy of the VLASS.

rlw@stsci.edu
Posts: 13
Joined: Mon Mar 03, 2014 6:42 pm

Re: Impact of surveys, impact on peer-reviewed science

Postby rlw@stsci.edu » Wed Apr 02, 2014 11:33 pm

I've been working on additional analysis of publications and citations to compare the impact of FIRST+NVSS to other VLA projects. I still think that the best comparison would involve connecting papers based on VLA data back to the proposals, so that we know how much time got invested. I don't have that information but am doing what I can using the tools in ADS. I think the results make a convincing case that the time spent on surveys has a significantly larger impact than the average VLA program.

I'm looking at refereed papers published between 2000 and 2010 (inclusive). I picked those years to be past the bulk of the NVSS+FIRST observations (and the publication of their respective survey papers) and to include papers that are old enough to have accumulated a reasonable number of citations.

I put papers into 2 categories based on the contents of the abstract and whether they cite NVSS and/or FIRST:

(1) Non-survey papers:
- Abstract mentions VLA (or Very Large Array)
- AND Abstract does not mention FIRST or NVSS

(2) Survey papers:
- Abstract mentions FIRST or NVSS
- OR Abstract does not mention FIRST, NVSS, or VLA, but paper cites FIRST/NVSS survey papers
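The two rules above can be written as a small classifier. This is a sketch of the logic only, not the ADS queries actually used; the case-sensitive, whole-word matching and the function name `classify` are assumptions made for illustration.

```python
import re

# Assumption: case-sensitive, whole-word matches, so "first" or "VLASS" do not count.
VLA_RE = re.compile(r"\bVLA\b|Very Large Array")
SURVEY_RE = re.compile(r"\b(?:FIRST|NVSS)\b")


def classify(abstract: str, cites_survey_paper: bool):
    """Return 'survey', 'non-survey', or None for a paper in neither category."""
    mentions_survey = bool(SURVEY_RE.search(abstract))
    mentions_vla = bool(VLA_RE.search(abstract))

    if mentions_survey:
        return "survey"        # abstract mentions FIRST or NVSS
    if mentions_vla:
        return "non-survey"    # mentions VLA but neither survey, even if it cites them
    if cites_survey_paper:
        return "survey"        # no mention in the abstract, but cites a survey paper
    return None                # not a VLA-related paper under these rules


print(classify("New VLA observations of a radio galaxy.", True))    # non-survey
print(classify("We cross-match SDSS quasars with FIRST.", False))   # survey
print(classify("Optical spectra of faint quasars.", True))          # survey
```

Written this way, the categories are mutually exclusive, and a paper whose abstract mentions only the VLA lands in the non-survey bin even when it cites the survey papers.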

Note that the non-survey criteria count even some papers that cite the FIRST or NVSS publications as non-survey papers. I looked at the 10 most-cited papers that cite the FIRST/NVSS papers but that mention only VLA (and not FIRST or NVSS) in the abstract. Two of them are really survey papers (they only use NVSS or FIRST data for their results), while eight use FIRST or NVSS as supplements (or to define samples) for new VLA observations. Counting all of these papers in the "non-survey" category can only underestimate the relative impact of the surveys.

Here are the counts of the number of publications and citations in those two categories:

Summary for 2000-2010 refereed papers

Category   Papers Cites Cites/paper
Non-survey  1822  51490  28.3
Survey      1775  86341  48.6


The number of Survey and Non-Survey publications is similar over this 11-year time span. And the citation rate for the survey papers (i.e., those using FIRST or NVSS data) is much higher than the non-survey papers. Both the number of publications and their citation rates strongly support the argument that the impact of VLA time spent on surveys is much larger than VLA time spent on non-survey programs.
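As a quick check, the cites/paper column can be recomputed from the paper and citation counts in the table. The dictionary layout below is just for illustration.

```python
# (papers, citations) for 2000-2010 refereed papers, from the summary table.
counts = {
    "Non-survey": (1822, 51490),
    "Survey": (1775, 86341),
}

cites_per_paper = {name: cites / papers for name, (papers, cites) in counts.items()}
rate_ratio = cites_per_paper["Survey"] / cites_per_paper["Non-survey"]

for name, rate in cites_per_paper.items():
    print(f"{name:<11}{rate:5.1f} cites/paper")
print(f"Survey / Non-survey citation rate: {rate_ratio:.2f}")
```

With these counts the survey papers are cited at roughly 1.7 times the per-paper rate of the non-survey papers.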

There are plenty of caveats around these numbers, of course. The large citation rate for the survey papers is certainly attributable to the high citation rate of SDSS papers, which relied mainly on FIRST data. But that is the point of doing these surveys, after all! The impact of the radio surveys in conjunction with surveys at other wavelengths (optical, infrared, X-ray, etc.) is very large.

The amount of observing time spent on all VLA projects published from 2000-2010 is certainly at least 10 times greater than the combined time spent on the FIRST+NVSS surveys (which I estimate at 7000 hours, comparable to our proposed VLASS surveys). Even if you count only those papers that actually mention FIRST or NVSS in the abstract and ignore all the papers that only cite the survey publications --- which surely grossly underestimates the impact of the surveys --- the numbers are still favorable for the surveys: 430 papers, 13183 citations, 30.7 cites/paper. Multiply those by 10 to normalize for the difference in observing time: the time spent in surveys has 2.5 times the impact of non-survey time in publications and citations.
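The normalization in the paragraph above can be spelled out as follows. It is a sketch under the stated assumption that non-survey projects used 10 times the observing time of FIRST+NVSS, which is an estimate rather than a measured ratio.

```python
time_ratio = 10   # assumed: non-survey observing time / survey observing time

# Conservative survey counts: only papers mentioning FIRST or NVSS in the abstract.
survey_papers, survey_cites = 430, 13183
# Non-survey counts from the summary table.
other_papers, other_cites = 1822, 51490

# Output per unit observing time, surveys relative to non-survey programs.
papers_per_time = survey_papers * time_ratio / other_papers
cites_per_time = survey_cites * time_ratio / other_cites

print(f"papers per unit time:    {papers_per_time:.2f}x")
print(f"citations per unit time: {cites_per_time:.2f}x")
```

The two ratios come out slightly below and slightly above 2.5 respectively, so "2.5 times" is a fair round number for both. Since the factor of 10 is described as a lower limit on the time ratio, the real advantage would be larger if non-survey time exceeded it.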

This is still not a definitive publication & citation analysis. But I will say this: there is extremely strong evidence against the argument that the science out of the VLA is negatively impacted when surveys displace regular proposals. Surveys will enhance science at the JVLA, just as they have at every other modern observatory, which is why more and more time at all major observatories is being dedicated to large surveys.

rlw@stsci.edu
Posts: 13
Joined: Mon Mar 03, 2014 6:42 pm

Re: Impact of surveys, impact on peer-reviewed science

Postby rlw@stsci.edu » Wed Apr 02, 2014 11:55 pm

elisabethmills wrote: We are using as justification two surveys that have largely reigned unique and supreme for almost 2 decades.
If we take their impact as an indication of future community use of the VLASS, then I think it should be considered just how unique the VLASS will be on similar timescales. As I believe the consensus right now is to use S band with perhaps 2'' resolution, it should largely be unique even in the face of deeper, lower-resolution, and lower-frequency upcoming surveys by SKA pathfinders and Westerbork. But it can be argued that, given that these more sensitive, or (potentially, if ALL-SKY does not happen) larger-area, surveys will be available at similar frequencies, there will be some fraction of the proposed survey science for which these other surveys will be better than the VLASS.

If one uses metrics that assume the VLASS will necessarily have the lasting community impact of NVSS or FIRST, a very strong case should be made for its absolute uniqueness (how much of the science being proposed can only be done with the VLASS) as well as complementarity of the VLASS with these other surveys (how much of the proposed science can use e.g. EMU/WODAN+VLASS for added value) which will define the legacy of the VLASS.


Just one comment on this: while I agree that the long-term value and impact of a JVLA survey should be one criterion for defining its impact, we have a clear counterexample to the argument that the survey must be "absolutely unique" compared to any existing or planned future survey. Just look at the NVSS and FIRST! They were carried out at the same time, with the same telescope and receivers. They observed at the same frequency. The sky covered by FIRST was also completely covered by NVSS. FIRST is only about 2.5 times deeper than NVSS for point sources, and the sensitivity difference is even less for extended sources.

And yet both FIRST and NVSS have thrived, and both have had a demonstrably large impact on radio and multi-wavelength science. How can that be?

The answer is that the higher resolution of FIRST (with a beam 8 times smaller than NVSS) is essential in doing cross-matches to SDSS and other deep imaging observations. Science with NVSS depends on its larger sky coverage and on the more accurate fluxes that come from a low-resolution survey.

Even though these surveys were carried out and released essentially simultaneously, and even though they had many characteristics in common, it took only one difference --- resolution --- to distinguish them and make them both widely used to this day.

So while I agree that it is important to consider the VLASS survey in the context of current and planned surveys and not to duplicate those surveys, I definitely do not agree that it is necessary to push to the extreme limits of the JVLA parameter space (e.g., very high frequencies and the highest possible spatial resolution) in order to distinguish it from coming low-frequency, low-resolution surveys by the SKA pathfinders.

