1 Introduction

Recently, Anders et al. (2021) showed that the population of Milky Way open clusters older than \(\sim \)1 Gyr and located closer than 2 kpc from the Sun is much smaller than previously thought, which has important implications for our knowledge of the formation and evolution of the Milky Way open cluster system. They arrived at this result by comparing their derived open cluster age function with that obtained from the Milky Way Star Clusters catalog (MWSC; Kharchenko et al. 2013; Piskunov et al. 2018; Krumholz et al. 2019). The decrease in the number of open clusters older than \(\sim \)1 Gyr in the Anders et al. (2021) sample arises because they could not confirm as genuine open clusters many of those included in the MWSC catalog in that age range. The Anders et al. (2021) compilation includes 268 open clusters older than 1 Gyr (also found in the MWSC catalog), which is nearly twice the number of old open clusters in the MWSC catalog not detected by them. The fundamental parameters (age, distance, metallicity, etc.) of the latter have been extensively used in studies of the open cluster system and the Milky Way disk (e.g., Joshi et al. 2016; Kharchenko et al. 2016; Dib et al. 2018), which means that they have been treated as real physical systems.

As mentioned by Cantat-Gaudin et al. (2018), on which Anders et al. (2021) based their work, several factors can explain why open clusters go undetected in their analysis, among them the density of the background, the interstellar extinction, the cluster’s star richness, the difference in proper motion between the cluster and the field, and the cluster age. Indeed, older open clusters, particularly the more distant ones, contain main sequence stars fainter than the magnitude limit used for detecting open clusters in the Anders et al. (2021) sample. Therefore, it becomes necessary to revisit those non-detected clusters in the MWSC catalog in order to examine their physical nature.

In contrast with this decrease in the number of confirmed old open clusters, Piatti (2010) found evidence of enhanced cluster formation episodes, with two primary excesses at \(\sim \)10–15 Myr and 1.5 Gyr. He used 1787 open clusters from the Dias et al. (2002) catalog, and confirmed both age peaks when restricting the sample to open clusters located in the solar neighborhood, with the aim of avoiding incompleteness effects. Moreover, recent searches for new open clusters based on the Gaia DR3 database (Gaia Collaboration et al. 2016; Babusiaux et al. 2022) and a variety of machine-learning techniques have found many new open clusters older than 1 Gyr (see, for instance, Kounkel et al. 2020; Castro-Ginard et al. 2022; Hao et al. 2022; Qin et al. 2023, among others). In this sense, previous efforts to assess the physical reality of open clusters showed the importance of dealing with a statistically complete sample of genuine open clusters when recovering the open cluster formation history and destruction rate, the structure of the Milky Way disk, etc. (Piatti et al. 2011; Piatti 2017; Dias et al. 2021).

The main goal of this paper is precisely to reanalyse the sample of open clusters older than 1 Gyr included in the MWSC catalog that are not present in the open cluster age function obtained by Anders et al. (2021), with the aim of providing a robust assessment of their physical nature. We describe the analysis strategy in Section 2 and discuss the derived results in Section 3. The Appendix includes all the supporting material, to ease the reading of the text.

Table 1 Literature search for selected MWSC catalog open clusters. Open clusters are listed according to their literature ages in descending order.

2 Data analysis

To compile a list of open clusters included in the MWSC catalog with ages larger than 1 Gyr and not detected by Anders et al. (2021), we compared both catalogs using the IRAF ttools.tdiffer task, which creates an output table that includes only the rows that differ between two input tables. We note that the Anders et al. (2021) sample consists of open clusters located closer than 2.0 kpc from the Sun, so we constrained our analysis to those clusters. As a comparison variable we employed the clusters’ names, and from the resulting list we selected those older than 1 Gyr, which turned out to be 136 open clusters. Before cross-matching the tables, which are small enough to be handled easily, we took care of spaces, underscores, different abbreviations, multiple names, etc., so we are confident that in this case cross-matching names was more reliable than using coordinates.
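
The name-based comparison described above can be sketched in a few lines of Python. This is only an illustration and not the IRAF ttools.tdiffer task itself: the normalization rules, the dictionary keys (name, log_age, dist_kpc) and the mock cluster names are assumptions made for the example, and a real cross-match would also need a table of aliases to handle abbreviations and multiple names.

```python
import re


def normalize_name(name):
    """Return a canonical form of a cluster name (hypothetical rule set).

    Spaces, underscores and hyphens are treated as equivalent separators,
    case is ignored, and leading zeros in numeric tokens are dropped.
    """
    cleaned = re.sub(r"[\s_\-]+", " ", name.strip().upper())
    tokens = []
    for token in cleaned.split(" "):
        if token.isdigit():
            token = token.lstrip("0") or "0"
        tokens.append(token)
    return " ".join(tokens)


def unmatched_old_clusters(catalog, reference_names,
                           min_log_age=9.0, max_dist_kpc=2.0):
    """Names of old, nearby catalog entries absent from the reference list.

    `catalog` is a list of dicts with the assumed keys 'name', 'log_age'
    and 'dist_kpc'; `reference_names` is an iterable of cluster names.
    """
    reference = {normalize_name(n) for n in reference_names}
    return [
        row["name"]
        for row in catalog
        if normalize_name(row["name"]) not in reference
        and row["log_age"] > min_log_age
        and row["dist_kpc"] < max_dist_kpc
    ]


# Mock entries, not taken from any real catalog.
mock_catalog = [
    {"name": "Mock_01", "log_age": 9.3, "dist_kpc": 1.5},
    {"name": "Mock 002", "log_age": 9.2, "dist_kpc": 1.2},
    {"name": "Mock 3", "log_age": 8.5, "dist_kpc": 1.0},
    {"name": "Mock 4", "log_age": 9.5, "dist_kpc": 2.5},
]
mock_reference = ["mock 2"]
missing = unmatched_old_clusters(mock_catalog, mock_reference)
```

Normalizing both lists before taking the difference is what makes a name comparison safe; without it, purely typographical variants of the same name would be counted as non-detections.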

We then searched the literature for detailed studies of these 136 open clusters that are independent from Kharchenko et al. (2013) and Anders et al. (2021). This guarantees a third-party analysis that gives us an independent assessment of the selected open clusters. We found 17 open clusters that meet this criterion. They are listed in Table 1 with the respective references. As can be seen, the open clusters with detailed studies represent \(\sim \)13% of the whole sample (136). This shows that most of the 136 selected open clusters have not been studied in detail other than by Kharchenko et al. (2013) and/or Anders et al. (2021), which justifies the present work.

Table 1 lists previous detailed works that confirmed the physical nature of some of these open clusters. We rely on these works as support for the existence of these objects as real open clusters older than 1 Gyr. From this point of view, we assume that their non-detection by Anders et al. (2021) could be caused by some of the reasons described in Cantat-Gaudin et al. (2018). Nevertheless, there is one object, ESO 436-02, which was also discarded as a genuine star aggregate by Piatti et al. (2017) (Table 1). We think that if more detailed studies of MWSC open clusters were carried out, some of them could be confirmed as real physical systems. Only 5 out of the 10 old open clusters in Table 1 (Bica 6, ESO 425-15, ESO 447-29, ESO 552-05, NGC 7036) are located within 2.0 kpc from the Sun (see references in Table 1), so they represent an increase of nearly 2% in the Anders et al. (2021) old open cluster sample. The remaining old open clusters in Table 1 have heliocentric distances from 2.3 up to 6.3 kpc, with an average of 4.0 kpc. We note that their heliocentric distances in the MWSC catalog are smaller than 2.0 kpc.

The age estimates of the open clusters older than 1 Gyr in Table 1, although derived from different studies, are in general agreement with those of the MWSC catalog. We obtained a mean difference and dispersion of \((\log (\textrm{age})_{\textrm{our}} - \log (\textrm{age})_{\textrm{MWSC}}) = 0.02\pm 0.21\). However, we found that 6 MWSC old open clusters turned out to be younger according to detailed independent works. We think that this discrepancy arises from limitations in the star field decontamination procedure applied to the open cluster color–magnitude diagrams (CMDs) (Kharchenko et al. 2013), which could mislead the fitting of theoretical isochrones. This is also the case for the unconfirmed open cluster ESO 436-02.

2.1 HDBSCAN analysis

After the above analysis, there remain 119 open clusters not included in the compilation of Anders et al. (2021) for which the MWSC catalog provides age estimates larger than 1 Gyr. As far as we are aware, these objects do not have any independent studies in the literature. Because confirming these objects as old open clusters is important for a comprehensive knowledge of the older end of the Milky Way open cluster age distribution, we decided to analyse them using independent data sets and analysis methods.

We retrieved from the Gaia DR3 database (Gaia Collaboration et al. 2016; Babusiaux et al. 2022) the R.A. and Dec. coordinates, parallaxes (\(\varpi \)), proper motions in R.A. and Dec. (pmra, pmdec) with their associated uncertainties, and G, BP and RP magnitudes of stars located inside circles with a radius of 30 arcmin centered on these 119 open clusters. We filtered the data following the recommendations described by Cantat-Gaudin et al. (2018) and imposed the following cuts: \(G< 18\) mag and |pmra|, |pmdec| <30 mas yr\(^{-1}\) (Hunt & Reffert 2021).
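
As an illustration of the explicit cuts quoted above (30 arcmin radius, \(G< 18\) mag, |pmra| and |pmdec| below 30 mas yr\(^{-1}\)), the following minimal sketch filters a list of stars held as Python dictionaries. The dictionary layout is an assumption for the example, although the keys follow the Gaia archive column names; the additional quality filters recommended by Cantat-Gaudin et al. (2018) are not reproduced here.

```python
import math


def angular_separation_arcmin(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation in arcmin, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(
        math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg)
    )
    a = (
        math.sin((dec2 - dec1) / 2.0) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2
    )
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a)))) * 60.0


def select_stars(stars, center_ra, center_dec,
                 radius_arcmin=30.0, g_max=18.0, pm_max=30.0):
    """Keep stars inside the search circle that pass the G and proper-motion cuts.

    Each star is a dict with the assumed keys 'ra', 'dec' (deg),
    'pmra', 'pmdec' (mas/yr) and 'phot_g_mean_mag' (mag).
    """
    selected = []
    for star in stars:
        inside = angular_separation_arcmin(
            star["ra"], star["dec"], center_ra, center_dec
        ) <= radius_arcmin
        bright = star["phot_g_mean_mag"] < g_max
        slow = abs(star["pmra"]) < pm_max and abs(star["pmdec"]) < pm_max
        if inside and bright and slow:
            selected.append(star)
    return selected
```

The haversine form is used instead of a flat-sky approximation so that the same function remains valid for fields at high declination.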

A real open cluster is characterized by being a spatial stellar overdensity composed of stars located at nearly the same distance from the Sun and sharing a mean motion. These conditions can be used by any clustering search engine to identify open clusters in the Gaia DR3 database. We used the hierarchical density-based spatial clustering of applications with noise algorithm (HDBSCAN, Campello et al. 2013), as recommended by Hunt & Reffert (2021), to search for overdensities in the 5D-phase space defined by R.A., Dec., \(\varpi \), pmra and pmdec. The min_cluster_size parameter was varied between 4 and 15 in steps of 1, and from each output we built diagnostic plots of the stellar groups identified by HDBSCAN, as illustrated in Figure 1. We colored the points according to the clusterer.labels_ attribute, which labels the different groups of stars identified in the 5D-phase space. As can be seen, groups of stars are identified not only close to (\(\Delta (\mathrm{R.A.})\times \cos (\mathrm{Dec.})\), \(\Delta (\mathrm{Dec.})) = (0,0)\) (centered on the object), but also across the searched field.

Fig. 1 Diagnostic diagrams from the HDBSCAN analysis for stars in the field of NGC 1520, for which HDBSCAN identified an unphysical stellar system. Colored points correspond to nine different HDBSCAN groups of stars in the 5D-phase space.

The number of groups and the stars included in them can vary with the min_cluster_size parameter. Therefore, we visually inspected the four panels of Figure 1, looking for the optimum min_cluster_size value around which the clusterer.labels_ values remain constant and the groups show similar star distributions. Each clusterer.labels_ value corresponds to a particular group of stars in Figure 1. Once we chose a group of stars, we used its clusterer.labels_ value to build Figure 2, which shows the distribution of the selected stars clustered in the 5D-phase space in four different plots. HDBSCAN also provides the membership probability of each star to the corresponding group. For the reader’s benefit, Figure 2 illustrates an example of a group of stars not confirmed as an open cluster (Table 2). If the chosen group of stars showed the expected small dispersion in the vector point diagram, (\(\Delta \)(pmra), \(\Delta \)(pmdec)) \(\sim \)(1 mas yr\(^{-1}\), 1 mas yr\(^{-1}\)) (Hunt & Reffert 2021); a relatively constant trend with \(\varpi \) in the \(\varpi \) vs. G diagram; and a CMD with star sequences suggesting the presence of an old open cluster, we selected that object as a possible candidate for further detailed analysis. Figures 3 and 4 show the stars of each candidate cluster, colored according to their membership probabilities.
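
The search for a min_cluster_size interval over which the grouping stays unchanged was done here by visual inspection. A rough automated counterpart could look like the sketch below, which assumes that the label arrays (one per min_cluster_size value, with -1 marking noise, as in clusterer.labels_) have already been collected in a dictionary. Requiring identical groups is stricter than the "similar star distributions" criterion used in the text, so this is a simplification and not the procedure actually followed.

```python
def partition(labels):
    """Turn a label array into a set of star-index groups, ignoring noise (-1).

    Two runs that find the same groups but number them differently
    yield the same partition.
    """
    groups = {}
    for index, label in enumerate(labels):
        if label >= 0:
            groups.setdefault(label, set()).add(index)
    return frozenset(frozenset(members) for members in groups.values())


def longest_stable_range(runs):
    """Longest interval of consecutive min_cluster_size values with identical groups.

    `runs` maps each min_cluster_size value to its label array and must
    not be empty. Returns a (first_size, last_size) tuple.
    """
    sizes = sorted(runs)
    best = (sizes[0], sizes[0])
    start = sizes[0]
    for previous, current in zip(sizes, sizes[1:]):
        if partition(runs[current]) != partition(runs[previous]):
            start = current
        if current - start > best[1] - best[0]:
            best = (start, current)
    return best


# Mock label arrays for five stars; values are illustrative only.
mock_runs = {
    4: [0, 0, 1, 1, -1],
    5: [1, 1, 0, 0, -1],
    6: [0, 0, 1, 1, -1],
    7: [0, 0, 0, -1, -1],
}
stable = longest_stable_range(mock_runs)
```

Comparing partitions and not raw labels matters because the integer assigned to a given group is arbitrary and may change from one run to the next.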

Fig. 2 Diagnostic diagrams for a selected group of stars in the field of NGC 1520, chosen from Figure 1. Colored points represent different membership probabilities. Gray dots represent the whole Gaia DR3 data set used. The object was discarded as an open cluster because it shows neither the expected small dispersion in the vector point diagram, (\(\Delta \)(pmra), \(\Delta \)(pmdec)) \(\sim \) (1 mas yr\(^{-1}\), 1 mas yr\(^{-1}\)) (see top-right panel), nor a relatively constant trend with \(\varpi \) in the \(\varpi \) vs. G diagram, nor a CMD with star sequences suggesting the presence of an old open cluster. NGC 1520 was removed from the NGC catalog by Sulentic et al. (1973), as pointed out by Cantat-Gaudin & Anders (2020), who also concluded that it is not an open cluster. Symbol size in the top-left panel is proportional to the star brightness.

Table 2 lists the names of the objects that could not be confirmed as old open clusters because their diagnostic plots do not satisfy the above requirements. We are aware of the uncertainties in the Gaia DR3 data, particularly in the \(14 \le G\) (mag) \(\le \)18 range used, namely: \(\sigma (\varpi ) \le 0.2\) mas; \(\sigma \mathrm{(pmra, pmdec)} \le 0.2\) mas yr\(^{-1}\); \(\sigma (G) \le 0.01\) mag, \(\sigma \)(BP, RP) \(\le \) 0.02 mag (Babusiaux et al. 2022; Gaia Collaboration et al. 2022). These uncertainties do not affect the assessments made on the diagnostic plots (see, e.g., Figure 2). As an exercise, we took into account the parallax uncertainties (parallax is the most uncertain parameter) for all the examined open clusters and extensively tested HDBSCAN using parallax values drawn randomly from normal distributions with the respective mean values and associated errors. For the individual executions, we repeated the above analysis and recovered min_cluster_size and clusterer.labels_ values that led us to the same conclusion on the status of each object. Table 2 lists 110 objects, which represent about 80% of the open clusters older than 1 Gyr cataloged by Kharchenko et al. (2013) that were not included in the compilation by Anders et al. (2021).
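
The parallax perturbation exercise can be sketched as follows. The number of realizations and the random seed are assumptions, since neither is specified in the text, and the clustering step that would be rerun on every realization is left out.

```python
import random


def resample_parallaxes(parallaxes, parallax_errors, rng):
    """Draw one parallax per star from a normal distribution.

    Each distribution is centered on the measured parallax, with the
    quoted uncertainty as its standard deviation (both in mas).
    """
    return [rng.gauss(p, e) for p, e in zip(parallaxes, parallax_errors)]


def monte_carlo_realizations(parallaxes, parallax_errors, n_draws=100, seed=0):
    """Build n_draws perturbed copies of the parallax column.

    n_draws and seed are illustrative choices, not values from the text.
    Each copy would replace the parallax column before rerunning the
    clustering analysis.
    """
    rng = random.Random(seed)
    return [
        resample_parallaxes(parallaxes, parallax_errors, rng)
        for _ in range(n_draws)
    ]
```

Using a dedicated random.Random instance with a fixed seed keeps the exercise reproducible without touching the global random state.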

Table 2 Open clusters in the MWSC catalog not included in the Anders et al. (2021) compilation and not confirmed as older than 1 Gyr in this work by running HDBSCAN as a diagnostic test.

Fig. 3 Same as Figure 2 for FSR 0851. The \(\varpi \) vs. G plot shows the mean and standard deviation of \(\varpi \) drawn with solid and dashed lines, respectively. The CMD shows the best fitted isochrone superimposed.

Nine objects remain from the HDBSCAN analysis whose diagnostic plots hint at the possibility of their being open clusters older than 1 Gyr. When running HDBSCAN, proper motions turned out to be the variables with the clearest clustering, in most cases with a dispersion smaller than a couple of mas yr\(^{-1}\). Moreover, HDBSCAN identified only one group of points in the vector point diagram of each of these objects. All the stars considered, however, turned out to be spatially distributed in different groups (Figure 1), which suggests that proper motions alone cannot be used as the driving parameter to detect open clusters. Jaehnig et al. (2021) recently based their discovery of 11 new open clusters on overdensities in vector point diagrams built from Gaia DR2 data. The 11 new objects exhibit CMD star sequences resembling those of open clusters. However, Piatti et al. (2023) showed that the dispersions of their fundamental properties (age, distance, reddening and metallicity) are much larger than those usually obtained for open clusters. Indeed, their ages and metallicities resemble those of composite star field populations, or possibly of sparse groups of stars. This result prevents us from using proper motions as the main driver for identifying real open clusters.

Parallaxes turned out to be the least clustered variable when running HDBSCAN, except for the stars of the nine aforementioned open clusters. Nevertheless, their \(\varpi \) vs. G diagrams need some additional cleaning of interlopers, mainly stars located along the line of sight towards the open clusters with proper motions similar to those of cluster members.

2.2 Color–magnitude diagram analysis

The CMDs of these nine open clusters also show the presence of field stars. We used Figures 3 and 4 to further analyse these objects. First, we derived the mean and dispersion of the open cluster parallaxes, which are represented by solid and dashed lines, respectively, in the \(\varpi \) vs. G diagrams (bottom-right panels). From these parallaxes and the open clusters’ central coordinates, we obtained the mean reddening and its dispersion using different Milky Way reddening map models provided through the GALExtin interface (Amôres et al. 2021).
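
A minimal sketch of the parallax statistics is given below. The conversion to distance by simple inversion, \(d\,[\textrm{kpc}] = 1/\varpi \,[\textrm{mas}]\), is an assumption of this example: it ignores the Gaia parallax zero-point offset and the bias of the inverse for noisy parallaxes, and the text does not state how the distances were actually derived. The function name and the returned keys are likewise hypothetical.

```python
import statistics


def parallax_summary(parallaxes_mas):
    """Mean and sample standard deviation of member parallaxes.

    Also returns a naive distance from simple inversion of the mean
    parallax (an assumption of this sketch). Needs at least two stars
    and a positive mean parallax.
    """
    mean = statistics.mean(parallaxes_mas)
    dispersion = statistics.stdev(parallaxes_mas)
    return {
        "mean_mas": mean,
        "dispersion_mas": dispersion,
        "distance_kpc": 1.0 / mean,
    }


# Mock member parallaxes in mas; values are illustrative only.
summary = parallax_summary([0.20, 0.25, 0.30])
beyond_2kpc = summary["distance_kpc"] > 2.0
```

Under this approximation, a mean parallax below 0.5 mas places an object beyond the 2.0 kpc limit of the comparison sample.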

Fig. 4 Same as Figure 3 for the analysed MWSC open clusters older than 1 Gyr. The \(\varpi \) vs. G plot shows the mean and standard deviation of \(\varpi \) drawn with solid and dashed lines, respectively. The CMD shows the best fitted isochrone superimposed.

Table 3 Derived properties for the studied open clusters.

To remove field stars from the cluster CMDs we relied on the procedure devised by Piatti & Bica (2012), which has been shown to produce cleaned cluster CMDs (e.g., Piatti & Lucchini 2022; Piatti 2022 and references therein). The method consists of comparing the cluster CMD with the CMD of a reference star field located adjacent to the cluster region, and subtracting from the cluster CMD the stars closest to those in the reference star field CMD. Figure 3 illustrates the cleaned diagnostic plots for FSR 0851. The field-star cleaned diagnostic plots for the remaining 8 open clusters are shown in the Appendix (see Figure 4).
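
The idea of subtracting from the cluster CMD the stars closest to those of a reference field can be illustrated with a deliberately simplified nearest-neighbour version. It is not the Piatti & Bica (2012) implementation: the fixed matching tolerance max_dist, the colour_weight factor and the assumption that cluster and reference regions cover equal areas are choices made only for this sketch.

```python
import math


def clean_cmd(cluster_cmd, field_cmd, max_dist=0.5, colour_weight=1.0):
    """Remove from the cluster CMD one star per reference-field star.

    Both CMDs are lists of (colour, magnitude) tuples. For each field
    star, the nearest remaining cluster-region star in the CMD plane is
    removed, provided it lies within max_dist (mag). The tolerance and
    the colour weighting are hypothetical parameters of this sketch.
    """
    remaining = list(cluster_cmd)
    for field_colour, field_mag in field_cmd:
        if not remaining:
            break
        distances = [
            math.hypot(colour_weight * (colour - field_colour), mag - field_mag)
            for colour, mag in remaining
        ]
        nearest = min(range(len(remaining)), key=distances.__getitem__)
        if distances[nearest] <= max_dist:
            remaining.pop(nearest)
    return remaining


# Mock (BP-RP, G) pairs; values are illustrative only.
mock_cluster = [(0.5, 14.0), (1.0, 16.0), (1.2, 17.0)]
mock_field = [(1.01, 16.05), (3.0, 10.0)]
cleaned = clean_cmd(mock_cluster, mock_field)
```

Field stars without a nearby counterpart in the cluster CMD remove nothing, so sparsely populated regions of the CMD are left untouched.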

We used the independent measures of distance and reddening to guide the Automated Stellar Cluster Analysis code (ASteCA, Perren et al. 2015) in deriving the clusters’ ages and metallicities. ASteCA explores the parameter space of synthetic CMDs through the minimization of the likelihood function defined by Tremmel et al. (2013; the Poisson likelihood ratio, their Equation 10), using a parallel tempering Bayesian MCMC algorithm and the optimal binning method of Knuth (2018). To generate the synthetic CMDs, ASteCA uses the theoretical isochrones computed by Bressan et al. (2012, PARSEC v1.2S), the initial mass function of Kroupa (2002) and cluster masses in the range of 100–5000 \(M_\odot \), whereas binary fractions are allowed in the range of 0.0–0.5 with a minimum mass ratio of 0.5. Table 3 lists the resulting cluster astrophysical properties, while Figures 3 and 4 show the respective theoretical isochrones superimposed onto the cleaned cluster CMDs.

The nine analysed objects turned out to be old open clusters, the mean difference between the present ages and those listed in the MWSC catalog being \((\log (\textrm{age})_{\textrm{our}} - \log (\textrm{age})_{\textrm{MWSC}}) = -0.14 \pm 0.12\). Although they are old open clusters, their derived heliocentric distances are larger than 2.0 kpc, so they cannot be added to the present comparison between the Anders et al. (2021) compilation and the MWSC catalog. Heliocentric distances larger than 2.0 kpc were also derived for half of the old open clusters listed in Table 1 with detailed independent studies (MWSC distances <2.0 kpc). By inspecting the cluster CMDs with theoretical isochrones superimposed and shifted using the MWSC distances, we found that the isochrones match the sequences of field stars very well. This means that contamination by field stars is present at some level in the cluster CMDs used by Kharchenko et al. (2013). This contamination may also explain the recognition as open clusters of many objects that Gaia data combined with machine-learning techniques could not recover (see, e.g., Table 2).

3 Discussion and concluding remarks

The present detailed analysis of 136 open clusters included in the MWSC catalog, carried out with a dedicated HDBSCAN clustering search and a powerful technique for the decontamination of field stars in the cluster CMDs, shows that not all of them are older than 1 Gyr. We found that 19 out of the 136 objects analysed are real old open clusters; the remaining ones (117) are younger open clusters (6) or unconfirmed physical systems (111). Five out of the 19 confirmed old open clusters are located within 2 kpc from the Sun. They represent an increase of \(\sim \)2% in the compilation of clusters older than 1 Gyr by Anders et al. (2021). This outcome shows that detailed studies are necessary to disentangle the real nature of cataloged open clusters, and that the MWSC catalog contains a large percentage (29%) of non-real stellar aggregates among those with assigned ages larger than 1 Gyr. In brief, the MWSC catalog is superseded by those built from Gaia data.
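
The bookkeeping behind these numbers can be checked with a few lines of arithmetic using only counts quoted in the text. The denominator adopted for the 29% figure (268 + 136 = 404 MWSC old clusters within 2 kpc, with the 117 objects not confirmed as old in the numerator) is an assumption of this sketch; it is one reading that reproduces the quoted value, not a statement of how it was computed.

```python
# Counts quoted in the text.
literature_old, literature_younger, literature_unconfirmed = 10, 6, 1  # Table 1
hdbscan_old, hdbscan_unconfirmed = 9, 110                             # Sections 2.1 and 2.2
anders_old = 268                                                      # Anders et al. (2021)

confirmed_old = literature_old + hdbscan_old                          # 19
younger = literature_younger                                          # 6
not_confirmed = literature_unconfirmed + hdbscan_unconfirmed          # 111
total = confirmed_old + younger + not_confirmed                       # 136

# Five confirmed old clusters lie within 2 kpc of the Sun.
increase_percent = 100.0 * 5 / anders_old                             # about 1.9

# Assumed reading of the 29% figure: objects not confirmed as old,
# relative to all MWSC old clusters within 2 kpc (268 + 136).
mwsc_old_within_2kpc = anders_old + total                             # 404
not_old_percent = 100.0 * (younger + not_confirmed) / mwsc_old_within_2kpc
```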

Nevertheless, we also note that Hunt & Reffert (2023) showed that there are a number of limitations in HDBSCAN and in Gaia data, as well as differences in the quality cuts and in the definition of an open cluster, that can prevent the detection of some open clusters. Besides, a dedicated procedure to clean the field star contamination in cluster CMDs is also needed. Indeed, while performing an all-sky census of open clusters using the Gaia DR3 database and HDBSCAN, they did not detect 1152 open clusters in the MWSC catalog and built an open cluster age function which, for ages larger than 1 Gyr, includes even fewer clusters than that of Anders et al. (2021). They also show that the open cluster age distribution constructed by Kounkel et al. (2020) from Gaia data for that age range is in excellent agreement with that of Kharchenko et al. (2013) (see their Figure 11), which seems somewhat paradoxical. Furthermore, other studies based on the Gaia database have found even more old open clusters (He et al. 2022a, b, 2023; Hao et al. 2022), which in turn resulted in open cluster age distributions with an excess of old open clusters with respect to that built by Hunt & Reffert (2023, see their Figure 15).

We searched for the nine open clusters confirmed in this work (see Table 3) in the catalogs built by Kounkel et al. (2020), He et al. (2022a, 2022b, 2023) and Hunt & Reffert (2023), and found none of them. This outcome reinforces the conclusion that different clustering search methods can produce different outcomes.

Discrepancies about the completeness of the known old open cluster population, and hence of its age-distribution function, are also found in the previous literature. For instance, Kharchenko et al. (2013) claimed that the MWSC catalog is almost complete out to 1.8 kpc from the Sun, except possibly for clusters older than 1 Gyr, a result that Joshi et al. (2016) used as support in their analysis of the Galactic structure and to derive different relationships between cluster mass, age and diameter. On the other hand, Anders et al. (2021) estimated a completeness of \(\sim \)88% for their old open cluster sample based on open cluster recovery experiments. They performed those experiments using as a reference the catalog of Castro-Ginard et al. (2020), which was also built from Gaia DR2 data. Curiously, both Kharchenko et al. (2013) and Anders et al. (2021) stated that their cluster samples are almost complete, although the former includes \(\sim \)50% more old open clusters than the latter. The above results show that our knowledge of the old end of the open cluster population is far from complete, so drawing definitive conclusions on its properties could be risky at the present level of completeness.

We think that some of the present constraints on building a statistically complete old open cluster age-distribution function in the solar neighborhood could be mitigated by deeper imaging surveys, which could help identify still undiscovered old open clusters. Based on currently available imaging surveys, recent works applying machine-learning techniques to identify open clusters (see pros and cons of different methods in Hunt & Reffert 2021), and particularly new discoveries (see a summary compiled in Table 3 of Hunt & Reffert 2023), present mainly open clusters with relatively long main sequences (>6 mag long). However, distant old open clusters do not show long main sequences down to \(G = 18\) mag (Figures 3 and 4), which is the most common limiting magnitude used when dealing with Gaia data.

In summary, we did not find any remarkable difference between the population of open clusters older than 1 Gyr in Anders et al. (2021) and its counterpart in the MWSC catalog, although there are still doubts about their completeness. As mentioned above, other recent searches for new open clusters have also found more old objects using Gaia data, which raises the issue of the different performances of the devised detection procedures, including machine-learning methods and the cleaning of field stars in the CMDs. Therefore, a comprehensive solar neighborhood old open cluster age-distribution function is still under construction, and will require much more effort focused, among other things, on deeper imaging surveys. From this point of view, the claims by Anders et al. (2021) about a remarkable earlier drop of the old open cluster population should be considered in the light of the above-mentioned challenges. We think that the careful analysis carried out in this work sheds light on the advantages and disadvantages of different data sets and analysis procedures. As far as we are aware, they have been highlighted unevenly in the literature.