21 August, 2020

Elastic (part 1)

Over the last couple days I've started using Elasticsearch as part of an internal application.

Going from zero to novice, there are a number of stumbling blocks that aren't really clear in the documentation. I hope this helps others frantically googling for these same issues 😊

Tutorials prior to version 7 don't work (without modification)

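If you come across a tutorial or code snippet that uses _all, _default_, or a mapping type such as _doc inside the index mappings, it's not going to work as written. Those stopped working after Elasticsearch 6, with the removal of mapping types. (_doc in a document URL, such as PUT /index-name/_doc/1, is still fine.)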
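As a rough illustration of the modification older snippets tend to need, the sketch below unwraps a 6.x-style mapping that is nested under a type name (such as _doc) into the typeless form that version 7 expects. The helper name strip_mapping_type and the list of recognised top-level mapping keys are my own assumptions, not part of any Elasticsearch client.

```python
# Hypothetical helper: not part of the Elasticsearch client library.
# Keys that can legitimately appear at the top level of a typeless mapping
# (an illustrative, non-exhaustive list).
TOP_LEVEL_MAPPING_KEYS = {
    "properties", "dynamic", "enabled", "dynamic_templates",
    "_source", "_routing", "_meta",
}


def strip_mapping_type(body):
    """Return a create-index body with the mapping type wrapper removed."""
    mappings = body.get("mappings", {})
    if len(mappings) == 1:
        (key, value), = mappings.items()
        # A single unrecognised key wrapping a dict looks like a type name
        if key not in TOP_LEVEL_MAPPING_KEYS and isinstance(value, dict):
            return {**body, "mappings": value}
    return body


old_style = {"mappings": {"_doc": {"properties": {"title": {"type": "text"}}}}}
new_style = strip_mapping_type(old_style)
print(new_style)
```

A body that is already typeless passes through unchanged, so it should be safe to run over a mix of old and new snippets.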

'Limit of total fields [1000] in index [index-name] has been exceeded'

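By default, Elasticsearch indexes every field in your documents, and each new field name it sees is added to the index mapping. If your data has dynamic field names (for example, IDs used as keys), the mapping can grow past the default limit of 1000 fields, and you may not want all of them indexed anyway.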
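To get a feel for how quickly dynamic field names add up, the sketch below counts the distinct leaf field paths across a handful of documents. The function name and the sample data are made up for illustration, and the count is only a lower bound on what Elasticsearch tallies, since object mappings also count towards the limit.

```python
def leaf_field_paths(doc, prefix=""):
    """Yield the dotted path of every leaf value in a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from leaf_field_paths(value, path)
        else:
            yield path


# Made-up documents where a user ID is used as a field name
docs = [
    {"id": 1, "logins": {"user_1001": 3, "user_1002": 1}},
    {"id": 2, "logins": {"user_1003": 7}},
]

fields = set()
for doc in docs:
    fields.update(leaf_field_paths(doc))

print(len(fields), sorted(fields))
```

Every new user ID adds another field path, so the total grows with the data rather than with the schema.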

To stop Elasticsearch from indexing any fields (e.g., if you just want to retrieve documents by their ID), you can disable the mapping when you create the index, before you upload data:

{
    "mappings": {
        "enabled": false
    }
}

If you're using the Python API to load the data, you can use native dicts - you don't need to use JSON strings:

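from elasticsearch import Elasticsearch

# Assumes a local, unsecured node; adjust the URL for your cluster
es = Elasticsearch("http://localhost:9200")

# Disable the mapping: documents are kept in _source, but no fields are indexed
index_properties = {
    'mappings': {
        'enabled': False,
    }
}

es.indices.create(index="cool-index", body=index_properties)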

Also, it doesn't look like you can currently use enabled to exclude all fields except certain ones, so something like this doesn't work (please, tweet me @jlaundry if this does start working):

{
    "mappings": {
        "enabled": false,
        "properties": {
            "field_i_want_to_index": {
                "enabled": true
            }
        }
    }
}
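One option that may get close to this is the dynamic mapping parameter rather than enabled. As far as I understand the documentation, with "dynamic": false any field not listed under properties is left out of the mapping and is not indexed, but still comes back in _source. A minimal sketch, where the field type is a guess, the helper name is hypothetical, and es is assumed to be an already-connected Elasticsearch client:

```python
# Only fields listed under 'properties' are indexed; with 'dynamic': False,
# unlisted fields are ignored by the mapping but kept in _source.
index_properties = {
    'mappings': {
        'dynamic': False,
        'properties': {
            'field_i_want_to_index': {
                'type': 'keyword',  # assumed type, pick whatever suits the data
            },
        },
    }
}


def create_selective_index(es, name="cool-index"):
    """Create the index using an existing client (hypothetical helper)."""
    return es.indices.create(index=name, body=index_properties)
```

Because unlisted fields never reach the mapping, this should also keep the index clear of the total fields limit.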

"Can't merge a non object mapping [client.ip] with an object mapping [client.ip]" and "Can't merge a non object mapping [server.ip] with an object mapping [server.ip]"

This happens because the dotted notation used in Elastic Common Schema (ECS) field names needs to be fully expanded into nested objects in the mapping.

For example, when creating your index and setting types, don't do this:

{
    "mappings": {
        "properties": {
            "client.address": {
                "type": "keyword"
            },
            "client.ip": {
                "type": "ip"
            },
            "client.port": {
                "type": "long"
            }
        }
    }
}

Do this instead:

{
    "mappings": {
        "properties": {
            "client": {
                "properties": {
                    "address": {
                        "type": "keyword"
                    },
                    "ip": {
                        "type": "ip"
                    },
                    "port": {
                        "type": "long"
                    }
                }
            }
        }
    }
}
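If the mapping is built from a flat list of dotted ECS names, the expansion can be automated rather than written by hand. The sketch below is one way to do it in plain Python; expand_dotted_properties is a name I made up, and it does not handle a name that is both a leaf and a parent (e.g. "client" alongside "client.ip").

```python
def expand_dotted_properties(flat):
    """Turn {'client.ip': {...}} into {'client': {'properties': {'ip': {...}}}}."""
    nested = {}
    for dotted_name, field_mapping in flat.items():
        parts = dotted_name.split(".")
        level = nested
        # Walk (or create) an object mapping for every parent segment
        for parent in parts[:-1]:
            level = level.setdefault(parent, {}).setdefault("properties", {})
        level[parts[-1]] = field_mapping
    return nested


flat_properties = {
    "client.address": {"type": "keyword"},
    "client.ip": {"type": "ip"},
    "client.port": {"type": "long"},
}

index_properties = {
    "mappings": {"properties": expand_dotted_properties(flat_properties)}
}
print(index_properties)
```

The resulting dict has the same shape as the expanded mapping above and can be passed as the body when creating the index.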