[Feature Request]: Tags #1200

Open
reakaleek opened this issue Apr 30, 2025 · 18 comments

Comments

@reakaleek
Member

reakaleek commented Apr 30, 2025

Context

The legacy asciidocs build supported tags.
In the conf.yaml, you can see these tags applied at a book level.

These tags were also used to enhance the search experience, making it possible to filter by tags.

Goal

Add a tags feature that can be used by the search feature for filtering.

Requirements

  • The user should be able to set one or more tags for a page.
  • The user should be able to set a default tag for reference content.
    E.g. the logstash reference should have the "Logstash" tag by default
  • The user should be able to append tags on a page, in addition to the default tag.
    Hence, if a default tag exists and the user adds tags to a page, it should contain both the default tag and the additional tags.
  • Linting: Make sure we reuse existing tags.
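
The requirements above (default tags, page-level additions, and reuse of existing tags) could be sketched roughly as follows. This is only an illustration: `resolve_tags` and the `known` set are hypothetical names, not part of docs-builder.

```python
def resolve_tags(default_tags, page_tags, known_tags):
    """Combine default tags with page-level tags (hypothetical helper).

    Defaults come first, duplicates are dropped, and any tag that is not
    in known_tags is reported so authors can reuse an existing tag instead.
    """
    merged = []
    for tag in [*default_tags, *page_tags]:
        if tag not in merged:
            merged.append(tag)
    unknown = [tag for tag in merged if tag not in known_tags]
    return merged, unknown


# Assumed example data: "Logstsh" is a deliberate typo to trigger the lint.
known = {"Logstash", "Elasticsearch", "Kibana"}
tags, unknown = resolve_tags(["Logstash"], ["Elasticsearch", "Logstash", "Logstsh"], known)
print(tags)
print(unknown)
```

Keeping the default tag first and preserving order makes the output stable between builds, which should help when diffing generated metadata.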

Proposed implementation

  1. Add a tags frontmatter option; these tags should then be added to HTML meta tags so that the web team can crawl and parse them.
  2. Write a script to parse tags from conf.yaml and add them to frontmatter, based on the prefix attribute in conf.yaml and the mapped_pages frontmatter in elastic/docs-content. This means a page might have multiple tags.
  3. Do a release so we can already utilize the tags for search.
  4. Add a "default" tag feature for reference content. (Can we add it somehow to assembler.yml or navigation.yml?)

Open Questions

  • What format should the meta tags be in so that the web team can parse them?
  • Can we add additional metadata for the search? E.g., content type, license, or version? AFAIU, right now it's only for "Products" filtering
  • What's the best way to add default tags for reference content? docset.yml, toc.yml, assembler.yml or navigation.yml?

References

Stakeholders

@elastic/docs-engineering, @KOTungseth, @colleenmcginnis, @elastic/webteam

@acsnyder

acsnyder commented Apr 30, 2025

There are a few meta tags that the search crawler relies on to extract data from pages, but they are missing on the new docs site. The following meta tags contain dummy data to illustrate the kind of data needed.

<meta class="elastic" name="product_version" content="8.17"/>
<meta class="elastic" name="product_name" content="Elasticsearch"/>
<meta class="elastic" name="website_area" content="documentation"/>
<meta name="DC.subject" content="Elasticsearch"/>

The DC.subject meta tag may duplicate product_name; we can adjust the crawler if necessary. I created a ticket last year, but it might have been added to a deprecated repository.
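
To illustrate how a consumer might read these tags, here is a minimal sketch using Python's standard `html.parser`. It is not the web team's crawler; `ElasticMetaParser` and `extract_meta` are hypothetical names, and the filter (class `elastic` or name `DC.subject`) is an assumption based on the four tags listed above.

```python
from html.parser import HTMLParser


class ElasticMetaParser(HTMLParser):
    """Collects name/content pairs from the meta tags shown above."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta .../> tags also reach this handler by default.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name and (attr.get("class") == "elastic" or name == "DC.subject"):
            self.fields[name] = attr.get("content", "")


def extract_meta(html_text):
    parser = ElasticMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.fields


sample = (
    '<meta class="elastic" name="product_version" content="8.17"/>\n'
    '<meta class="elastic" name="product_name" content="Elasticsearch"/>\n'
    '<meta class="elastic" name="website_area" content="documentation"/>\n'
    '<meta name="DC.subject" content="Elasticsearch"/>\n'
)
print(extract_meta(sample))
```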

@zumwalt

zumwalt commented Apr 30, 2025

++ to @acsnyder's comment.

Can we add additional metadata for the search? E.g. content type, license, or version?

Yes, we can support any data that you'd like to filter/facet on, assuming that we can reliably extract it for all documents.

@reakaleek
Member Author

reakaleek commented Apr 30, 2025

Thank you for the quick response. @acsnyder In the new Information Architecture (IA), one page could apply to multiple products.

How would we handle this case?

@acsnyder

@reakaleek We could have multiple values separated by some kind of delimiter. We'd have to update the crawler script for that. Can you give an example of a page used for multiple products?

@reakaleek
Member Author

@KOTungseth, can you please help me find an example page?

@KOTungseth
Contributor

@acsnyder

acsnyder commented May 1, 2025

Thanks @KOTungseth! The second item (Deploy) is the same link as the source file.

Looking at your mapped_pages frontmatter, it looks like we may need to rethink how we crawl docs pages. I'll comment later when I have some time to check locally.

@bmorelli25
Member

bmorelli25 commented May 1, 2025

Another requirement:

@reakaleek
Member Author

reakaleek commented May 5, 2025

@colleenmcginnis provided a list of product names and their kebab-case mapping value.

mapping
# I wasn't sure where to put this...
# We can delete it from here if we want to move it to docs-builder.
apm: 'APM'
apm-dotnet-agent: 'APM .NET Agent'
apm-android-agent: 'APM Android Agent'
apm-attacher: 'APM Attacher'
apm-aws-lambda-extension: 'APM AWS Lambda extension'
apm-go-agent: 'APM Go Agent'
apm-ios-agent: 'APM iOS Agent'
apm-java-agent: 'APM Java Agent'
apm-node-agent: 'APM Node.js Agent'
apm-php-agent: 'APM PHP Agent'
apm-python-agent: 'APM Python Agent'
apm-ruby-agent: 'APM Ruby Agent'
apm-rum-agent: 'APM RUM Agent'
beats-logging-plugin: 'Beats Logging plugin'
cloud-control-ecctl: 'Cloud Control ECCTL'
cloud-enterprise: 'Cloud Enterprise'
cloud-hosted: 'Cloud Hosted'
cloud-kubernetes: 'Cloud Kubernetes'
cloud-native-ingest: 'Cloud Native Ingest'
cloud-serverless: 'Cloud Serverless'
cloud-terraform: 'Cloud Terraform'
ecs-logging: 'ECS Logging'
ecs-logging-dotnet: 'ECS Logging .NET'
ecs-logging-go-logrus: 'ECS Logging Go Logrus'
ecs-logging-go-zap: 'ECS Logging Go Zap'
ecs-logging-go-zerolog: 'ECS Logging Go Zerolog'
ecs-logging-java: 'ECS Logging Java'
ecs-logging-node: 'ECS Logging Node.js'
ecs-logging-php: 'ECS Logging PHP'
ecs-logging-python: 'ECS Logging Python'
ecs-logging-ruby: 'ECS Logging Ruby'
elastic-agent: 'Elastic Agent'
ecs: 'Elastic Common Schema (ECS)'
elastic-products-platform: 'Elastic Products platform'
elastic-stack: 'Elastic Stack'
elasticsearch: 'Elasticsearch'
elasticsearch-dotnet-client: 'Elasticsearch .NET Client'
elasticsearch-apache-hadoop: 'Elasticsearch Apache Hadoop'
elasticsearch-cloud-hosted-heroku: 'Elasticsearch Cloud Hosted Heroku'
elasticsearch-community-clients: 'Elasticsearch community clients'
elasticsearch-curator: 'Elasticsearch Curator'
elasticsearch-eland-python-client: 'Elasticsearch Eland Python Client'
elasticsearch-go-client: 'Elasticsearch Go Client'
elasticsearch-groovy-client: 'Elasticsearch Groovy Client'
elasticsearch-java-client: 'Elasticsearch Java Client'
elasticsearch-java-script-client: 'Elasticsearch JavaScript Client'
elasticsearch-painless-scripting-language: 'Elasticsearch Painless scripting language'
elasticsearch-perl-client: 'Elasticsearch Perl Client'
elasticsearch-php-client: 'Elasticsearch PHP Client'
elasticsearch-plugins: 'Elasticsearch plugins'
elasticsearch-python-client: 'Elasticsearch Python Client'
elasticsearch-resiliency-status: 'Elasticsearch Resiliency Status'
elasticsearch-ruby-client: 'Elasticsearch Ruby Client'
elasticsearch-rust-client: 'Elasticsearch Rust Client'
fleet: 'Fleet'
ingest: 'Ingest'
integrations: 'Integrations'
kibana: 'Kibana'
logstash: 'Logstash'
machine-learning: 'Machine Learning'
observability: 'Observability'
reference-architectures: 'Reference Architectures'
search-ui: 'Search UI'
security: 'Security'

This list is incomplete and lacks products for some reference content.
But it's a complete list for docs-content.
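
As a rough sketch of how such a mapping might be used, the snippet below resolves kebab-case product IDs to display names and fails on IDs that are not in the list. `PRODUCT_NAMES` holds only a small excerpt of the mapping above, and `product_display_names` is a hypothetical helper, not an existing docs-builder function.

```python
# Excerpt of the mapping above; a real build would load the full list.
PRODUCT_NAMES = {
    "cloud-hosted": "Cloud Hosted",
    "cloud-serverless": "Cloud Serverless",
    "elasticsearch": "Elasticsearch",
    "kibana": "Kibana",
}


def product_display_names(product_ids, mapping=PRODUCT_NAMES):
    """Translate product IDs to display names, rejecting unknown IDs."""
    unknown = [pid for pid in product_ids if pid not in mapping]
    if unknown:
        raise ValueError(f"Unknown product id(s): {', '.join(unknown)}")
    return [mapping[pid] for pid in product_ids]


print(product_display_names(["cloud-hosted", "cloud-serverless"]))
```

Failing loudly on an unknown ID is one possible way to enforce the "reuse existing tags" requirement at build time; a warning would be a softer alternative.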

@lcawl
Contributor

lcawl commented May 6, 2025

It is interesting to ponder whether users will eventually want to filter on content types within the "documentation" area (even if it's just "reference" and "guide" for now). I am mentioning this because we'll also be striving to add matching meta tags in the pages we're hosting on https://www.elastic.co/docs/api/. That raises the question of whether we want to enable folks to filter those in or out of the search results, in which case the values could potentially be nested like this: "documentation" > "reference" > "API".

@theletterf
Contributor

theletterf commented May 7, 2025

Is this the desired form of the product frontmatter attribute?

products:
  - cloud-hosted
  - cloud-serverless

The mapping list is missing values for EDOT distributions. I'm providing one here following the convention established in the previous comments:

Additional mappings for EDOT
edot-collector: 'Elastic Distribution of OpenTelemetry Collector'
edot-java: 'Elastic Distribution of OpenTelemetry Java'
edot-dotnet: 'Elastic Distribution of OpenTelemetry .NET'
edot-nodejs: 'Elastic Distribution of OpenTelemetry Node.js'
edot-php: 'Elastic Distribution of OpenTelemetry PHP'
edot-python: 'Elastic Distribution of OpenTelemetry Python'
edot-android: 'Elastic Distribution of OpenTelemetry Android'
edot-ios: 'Elastic Distribution of OpenTelemetry iOS'

@reakaleek reakaleek self-assigned this May 7, 2025
@reakaleek
Member Author

reakaleek commented May 9, 2025

@acsnyder, we added the feature in #1226

This will add all the Product names delimited by , (comma).

e.g.

<meta class="elastic" name="product_name" content="Elasticsearch,Kibana"/>
<meta name="DC.subject" content="Elasticsearch,Kibana"/>
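
For illustration only, the comma-delimited output could be produced and split again along these lines. This is not the code from #1226; `render_product_meta` and `split_product_names` are hypothetical names, and the sketch assumes that no product display name contains a comma.

```python
from html import escape


def render_product_meta(product_names):
    """Build the two meta tags shown above from a list of display names."""
    content = escape(",".join(product_names), quote=True)
    return [
        f'<meta class="elastic" name="product_name" content="{content}"/>',
        f'<meta name="DC.subject" content="{content}"/>',
    ]


def split_product_names(content):
    """Crawler-side counterpart: split the content attribute on commas."""
    return [name for name in content.split(",") if name]


for line in render_product_meta(["Elasticsearch", "Kibana"]):
    print(line)
print(split_product_names("Elasticsearch,Kibana"))
```

If a display name ever includes a comma, the delimiter would need to change or the value would need escaping on both sides.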

However, we have not yet added the frontmatter attributes to the markdown pages.

This will be handled in elastic/docs-content#1336

KOTungseth added a commit to elastic/docs-content that referenced this issue May 14, 2025
⚠️ **This PR is dependent on
elastic/docs-builder#1256 being merged and
changes being released.**

Related to #1336
elastic/docs-builder#1200

Adds new `products` frontmatter, which will be used to generate metadata
during the build process that the web team will use in the search
experience (so users can filter by product).

_Note: This reduces the scope of
#1336 to only include
`products` tags in the frontmatter to unblock updates to the search
experience. I'll open a separate PR to update `applies_to` and tag
writers to review for their area._

Here's the process:

* **Map AsciiDoc/v3 products**: @KOTungseth created a list of all
AsciiDoc books mapped to product names.
* **Add frontmatter**: I wrote a script that uses Kaarina's list to look
at each Markdown file and assign `products` associated with the AsciiDoc
book(s) included in `mapped_pages`.
* **Format frontmatter**: I standardized the order and format of the
frontmatter items.
* **Validate frontmatter**: I created
[`frontmatter.config.yml`](https://github.com/elastic/docs-content/blob/add-product-tags/frontmatter.config.yml)
and checked against it to make sure all frontmatter keys and product
values are valid. (Note: I haven't checked that all the values of
`deployment` and `serverless` options are valid/correctly formatted.)
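
The "Add frontmatter" step could look roughly like the sketch below. It is not the script referred to above: `BOOK_PRODUCTS` is a made-up two-entry stand-in for the AsciiDoc book list, the URL prefixes are assumptions, and `products_for_page` is a hypothetical name.

```python
# Hypothetical stand-in for the list of AsciiDoc books mapped to products.
BOOK_PRODUCTS = {
    "https://www.elastic.co/guide/en/elasticsearch/reference/": "elasticsearch",
    "https://www.elastic.co/guide/en/kibana/": "kibana",
}


def products_for_page(mapped_pages, book_products=BOOK_PRODUCTS):
    """Return the sorted product IDs whose book prefix matches a mapped page."""
    found = set()
    for url in mapped_pages:
        for prefix, product in book_products.items():
            if url.startswith(prefix):
                found.add(product)
    return sorted(found)


mapped_pages = [
    "https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html",
    "https://www.elastic.co/guide/en/kibana/current/index.html",
]
print(products_for_page(mapped_pages))
```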

cc @KOTungseth @reakaleek @zumwalt

---------

Co-authored-by: Kaarina Tungseth <[email protected]>
colleenmcginnis added a commit to elastic/apm-agent-dotnet that referenced this issue May 15, 2025
Related to elastic/docs-builder#1200

Add `products` to `docset.yml` to be used in the search experience.

cc @KOTungseth
colleenmcginnis added a commit to elastic/ecs-dotnet that referenced this issue May 15, 2025
Related to elastic/docs-builder#1200

Add `products` to `docset.yml` to be used in the search experience.

cc @KOTungseth
florent-leborgne pushed a commit to elastic/kibana that referenced this issue May 16, 2025
Related to elastic/docs-builder#1200

Add `products` to `docset.yml` to be used in the search experience.

cc @KOTungseth
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue May 16, 2025
Related to elastic/docs-builder#1200

Add `products` to `docset.yml` to be used in the search experience.

cc @KOTungseth

(cherry picked from commit 266def5)
@KOTungseth
Contributor

Can we add additional metadata for the search? E.g., content type, license, or version? AFAIU, right now it's only for "Products" filtering

We should also support the following tags:

  • Version: This requires indexing of all versions of Docs, including AsciiDoc
  • Content type: guide, reference, release notes, troubleshooting, api
  • Language: Python, Java, Go, JavaScript, .NET, PHP, Ruby, Rust, etc
  • Offerings and subscriptions: Based on https://www.elastic.co/pricing

@reakaleek
Member Author

reakaleek commented May 20, 2025

@KOTungseth I have some questions.

Version: This requires indexing of all versions of Docs, including AsciiDoc

For AsciiDoc, I guess that would be relatively straightforward because the version is also in the URL. Hence, we need to somehow get that variable and add it as a meta tag.
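
A minimal sketch of that idea, assuming the version appears as a `major.minor` path segment (the URL below is an assumed example, and `version_from_url` is a hypothetical name):

```python
import re

# Assumption: the version is a path segment such as /8.17/.
VERSION_IN_PATH = re.compile(r"/(\d+\.\d+)/")


def version_from_url(url):
    """Return the major.minor version found in the URL path, or None."""
    match = VERSION_IN_PATH.search(url)
    return match.group(1) if match else None


url = "https://www.elastic.co/guide/en/elasticsearch/reference/8.17/index.html"
version = version_from_url(url)
if version is not None:
    print(f'<meta class="elastic" name="product_version" content="{version}"/>')
```

URLs that use an alias such as `current` return `None` here and would need separate handling.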

But what version do we put on V3? Should it only be certain stack versions?

Content type: guide, reference, release notes, troubleshooting, api

I guess we need to set this at toc.yml level and not in docset.yml. WDYT @Mpdreamz?

Language: Python, Java, Go, JavaScript, .NET, PHP, Ruby, Rust, etc

This could be at the docset.yml and frontmatter, right?

Offerings and subscriptions: Based on https://www.elastic.co/pricing

At which level do we need this information? per page? docset?

@reakaleek
Member Author

Generally, we might need to redesign this feature to accommodate all the needs.

This initial version was done ASAP, to fix the existing search problems.

@acsnyder

Thanks everyone for your work on adding product tags! We've added them to the ingest script and updated the search front-end.

@reakaleek
Member Author

Thank you @acsnyder!

@colleenmcginnis
Contributor

colleenmcginnis commented May 22, 2025

As of now, most of the PRs adding product metadata to repos containing docs are merged, but there are still a few PRs open to add product tags. @KOTungseth could you help with some reviews/approvals? 🙏


8 participants