Python Developer Track for Oracle JSON and Duality Views – Part 5: Indexing

In this tutorial we are going to learn about indexing. We can create different indexes with different purposes:

  1. Create an index to improve query performance
  2. Create a search index to search for keywords
  3. Create an index to get a Data Guide, or get the JSON structure

Let’s go with the first index. As usual, first we are going to connect to the Oracle Database 23c and open our collection “hotel_reservations”:

import oracledb
import json
#connect with thick-client
oracledb.init_oracle_client()
connection = oracledb.connect(user="myapp",
password="PassworD123##",
dsn="localhost:1521/FREEPDB1")
#auto commit
connection.autocommit = True
#Open collection for hotel reservations
soda = connection.getSodaDatabase()
collection = soda.openCollection("hotel_reservations")
import oracledb
import json

#connect with thick-client
oracledb.init_oracle_client()
connection = oracledb.connect(user="myapp",
                              password="PassworD123##",
                              dsn="localhost:1521/FREEPDB1")
#auto commit
connection.autocommit = True

#Open collection for hotel reservations
soda = connection.getSodaDatabase()
collection = soda.openCollection("hotel_reservations")
import oracledb import json #connect with thick-client oracledb.init_oracle_client() connection = oracledb.connect(user="myapp", password="PassworD123##", dsn="localhost:1521/FREEPDB1") #auto commit connection.autocommit = True #Open collection for hotel reservations soda = connection.getSodaDatabase() collection = soda.openCollection("hotel_reservations")

Enter fullscreen mode Exit fullscreen mode

Now we have to define the index. In this case I want to create a composite index, as I would like to search by the reservation_id and the room_id. We are going to call the index “reservation_index”. Then we will call the function “createIndex”:

index_def = {
'name': 'RESERVATION_INDEX',
'fields': [
{
'path': 'reservation_id',
'datatype': 'string',
'order': 'asc'
},
{
'path': 'room_id',
'datatype': 'string',
'order': 'asc'
},
]
}
collection.createIndex(index_def)
index_def = {
    'name': 'RESERVATION_INDEX',
    'fields': [
        {
            'path': 'reservation_id',
            'datatype': 'string',
            'order': 'asc'
        },
        {
            'path': 'room_id',
            'datatype': 'string',
            'order': 'asc'
        },

    ]
}
collection.createIndex(index_def)
index_def = { 'name': 'RESERVATION_INDEX', 'fields': [ { 'path': 'reservation_id', 'datatype': 'string', 'order': 'asc' }, { 'path': 'room_id', 'datatype': 'string', 'order': 'asc' }, ] } collection.createIndex(index_def)

Enter fullscreen mode Exit fullscreen mode

Easy, isn’t it? Now let’s create our second index. Let me show you the code first:

index_search_def ={
"name" : "SEARCH_AND_DATA_GUIDE_IDX",
"dataguide" : "on",
"search_on" : "text" }
collection.createIndex(index_search_def)
index_search_def ={ 
    "name"      : "SEARCH_AND_DATA_GUIDE_IDX",
    "dataguide" : "on",
    "search_on" : "text" }

collection.createIndex(index_search_def)
index_search_def ={ "name" : "SEARCH_AND_DATA_GUIDE_IDX", "dataguide" : "on", "search_on" : "text" } collection.createIndex(index_search_def)

Enter fullscreen mode Exit fullscreen mode

In this case, the index is called “search_and_data_guide_idx”. As the name describes, it has double purpose. There is a first parameter which is called “dataguide”. Data guide provides a summary of the structural and type information about our JSON documents. Later on we will have a look into it. This parameter can be on or off.

The second parameter is “search_on”. This parameter can have two values: text or text_value. If your search index involve only text search or string-equality search you can use text. If you are interested in numeric ranges as well as text, you have to use text_value.

Now that we have defined the index, let’s do a query where we are looking for bookings that are talking about checking, because the customer had added a comment into the booking:

documents = collection.find().filter({"$textContains" : "check%"}).getDocuments()
print('\n Found the following documents talking about check:')
for d in documents:
content = d.getContent()
print(content)
documents = collection.find().filter({"$textContains" : "check%"}).getDocuments()
print('\n Found the following documents talking about check:')
for d in documents:
    content = d.getContent()
    print(content)
documents = collection.find().filter({"$textContains" : "check%"}).getDocuments() print('\n Found the following documents talking about check:') for d in documents: content = d.getContent() print(content)

Enter fullscreen mode Exit fullscreen mode

We can see we have a reservation with the following comment: Please arrange for a 2pm checkout.

As promised before, let’s have a look at the Data Guide. One of the great benefits of JSON is the flexibility, and could be also one of the problems! If we want to know if there are any document which is not following the expected structure or if I want to know how it is, we can get this data guide. Let’s have a look:

data_guide= collection.getDataGuide().getContent()
print(json.dumps(data_guide, indent=1))
data_guide= collection.getDataGuide().getContent()
print(json.dumps(data_guide, indent=1))
data_guide= collection.getDataGuide().getContent() print(json.dumps(data_guide, indent=1))

Enter fullscreen mode Exit fullscreen mode

You should get the following JSON document. We can get all the info about number of fields, length and preferred column name.

>>> print(json.dumps(data_guide, indent=1))
{
"type": "object",
"o:length": 1,
"properties": {
"_id": {
"type": "string",
"o:length": 32,
"o:preferred_column_name": "DATA$_id"
},
"room_id": {
"type": "string",
"o:length": 4,
"o:preferred_column_name": "DATA$room_id"
},
"hotel_id": {
"type": "string",
"o:length": 4,
"o:preferred_column_name": "DATA$hotel_id"
},
"guest_name": {
"type": "object",
"o:length": 1,
"o:preferred_column_name": "DATA$guest_name",
"properties": {
"last_name": {
"type": "string",
"o:length": 16,
"o:preferred_column_name": "DATA$last_name"
},
"first_name": {
"type": "string",
"o:length": 8,
"o:preferred_column_name": "DATA$first_name"
}
}
},
"num_adults": {
"type": "number",
"o:length": 1,
"o:preferred_column_name": "DATA$num_adults"
},
"total_cost": {
"type": "number",
"o:length": 4,
"o:preferred_column_name": "DATA$total_cost"
},
"checkin_date": {
"type": "string",
"o:length": 16,
"o:preferred_column_name": "DATA$checkin_date"
},
"num_children": {
"type": "number",
"o:length": 1,
"o:preferred_column_name": "DATA$num_children"
},
"checkout_date": {
"type": "string",
"o:length": 16,
"o:preferred_column_name": "DATA$checkout_date"
},
"payment_status": {
"type": "string",
"o:length": 4,
"o:preferred_column_name": "DATA$payment_status"
},
"reservation_id": {
"type": "string",
"o:length": 1,
"o:preferred_column_name": "DATA$reservation_id"
},
"guest_contact_info": {
"type": "object",
"o:length": 1,
"o:preferred_column_name": "DATA$guest_contact_info",
"properties": {
"email": {
"type": "string",
"o:length": 32,
"o:preferred_column_name": "DATA$email"
},
"phone": {
"type": "string",
"o:length": 8,
"o:preferred_column_name": "DATA$phone"
},
"address": {
"type": "object",
"o:length": 1,
"o:preferred_column_name": "DATA$address",
"properties": {
"city": {
"type": "string",
"o:length": 16,
"o:preferred_column_name": "DATA$city"
},
"country": {
"type": "string",
"o:length": 8,
"o:preferred_column_name": "DATA$country"
}
}
}
}
},
"additional_requests": {
"type": "array",
"o:length": 1,
"o:preferred_column_name": "DATA$additional_requests",
"items": {
"properties": {
"type": {
"type": "string",
"o:length": 16,
"o:preferred_column_name": "DATA$type"
},
"details": {
"type": "string",
"o:length": 64,
"o:preferred_column_name": "DATA$details"
},
"quantity": {
"type": "number",
"o:length": 1,
"o:preferred_column_name": "DATA$quantity"
}
}
}
}
}
}
>>>
>>> print(json.dumps(data_guide, indent=1))
{
 "type": "object",
 "o:length": 1,
 "properties": {
  "_id": {
   "type": "string",
   "o:length": 32,
   "o:preferred_column_name": "DATA$_id"
  },
  "room_id": {
   "type": "string",
   "o:length": 4,
   "o:preferred_column_name": "DATA$room_id"
  },
  "hotel_id": {
   "type": "string",
   "o:length": 4,
   "o:preferred_column_name": "DATA$hotel_id"
  },
  "guest_name": {
   "type": "object",
   "o:length": 1,
   "o:preferred_column_name": "DATA$guest_name",
   "properties": {
    "last_name": {
     "type": "string",
     "o:length": 16,
     "o:preferred_column_name": "DATA$last_name"
    },
    "first_name": {
     "type": "string",
     "o:length": 8,
     "o:preferred_column_name": "DATA$first_name"
    }
   }
  },
  "num_adults": {
   "type": "number",
   "o:length": 1,
   "o:preferred_column_name": "DATA$num_adults"
  },
  "total_cost": {
   "type": "number",
   "o:length": 4,
   "o:preferred_column_name": "DATA$total_cost"
  },
  "checkin_date": {
   "type": "string",
   "o:length": 16,
   "o:preferred_column_name": "DATA$checkin_date"
  },
  "num_children": {
   "type": "number",
   "o:length": 1,
   "o:preferred_column_name": "DATA$num_children"
  },
  "checkout_date": {
   "type": "string",
   "o:length": 16,
   "o:preferred_column_name": "DATA$checkout_date"
  },
  "payment_status": {
   "type": "string",
   "o:length": 4,
   "o:preferred_column_name": "DATA$payment_status"
  },
  "reservation_id": {
   "type": "string",
   "o:length": 1,
   "o:preferred_column_name": "DATA$reservation_id"
  },
  "guest_contact_info": {
   "type": "object",
   "o:length": 1,
   "o:preferred_column_name": "DATA$guest_contact_info",
   "properties": {
    "email": {
     "type": "string",
     "o:length": 32,
     "o:preferred_column_name": "DATA$email"
    },
    "phone": {
     "type": "string",
     "o:length": 8,
     "o:preferred_column_name": "DATA$phone"
    },
    "address": {
     "type": "object",
     "o:length": 1,
     "o:preferred_column_name": "DATA$address",
     "properties": {
      "city": {
       "type": "string",
       "o:length": 16,
       "o:preferred_column_name": "DATA$city"
      },
      "country": {
       "type": "string",
       "o:length": 8,
       "o:preferred_column_name": "DATA$country"
      }
     }
    }
   }
  },
  "additional_requests": {
   "type": "array",
   "o:length": 1,
   "o:preferred_column_name": "DATA$additional_requests",
   "items": {
    "properties": {
     "type": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$type"
     },
     "details": {
      "type": "string",
      "o:length": 64,
      "o:preferred_column_name": "DATA$details"
     },
     "quantity": {
      "type": "number",
      "o:length": 1,
      "o:preferred_column_name": "DATA$quantity"
     }
    }
   }
  }
 }
}
>>>
>>> print(json.dumps(data_guide, indent=1)) { "type": "object", "o:length": 1, "properties": { "_id": { "type": "string", "o:length": 32, "o:preferred_column_name": "DATA$_id" }, "room_id": { "type": "string", "o:length": 4, "o:preferred_column_name": "DATA$room_id" }, "hotel_id": { "type": "string", "o:length": 4, "o:preferred_column_name": "DATA$hotel_id" }, "guest_name": { "type": "object", "o:length": 1, "o:preferred_column_name": "DATA$guest_name", "properties": { "last_name": { "type": "string", "o:length": 16, "o:preferred_column_name": "DATA$last_name" }, "first_name": { "type": "string", "o:length": 8, "o:preferred_column_name": "DATA$first_name" } } }, "num_adults": { "type": "number", "o:length": 1, "o:preferred_column_name": "DATA$num_adults" }, "total_cost": { "type": "number", "o:length": 4, "o:preferred_column_name": "DATA$total_cost" }, "checkin_date": { "type": "string", "o:length": 16, "o:preferred_column_name": "DATA$checkin_date" }, "num_children": { "type": "number", "o:length": 1, "o:preferred_column_name": "DATA$num_children" }, "checkout_date": { "type": "string", "o:length": 16, "o:preferred_column_name": "DATA$checkout_date" }, "payment_status": { "type": "string", "o:length": 4, "o:preferred_column_name": "DATA$payment_status" }, "reservation_id": { "type": "string", "o:length": 1, "o:preferred_column_name": "DATA$reservation_id" }, "guest_contact_info": { "type": "object", "o:length": 1, "o:preferred_column_name": "DATA$guest_contact_info", "properties": { "email": { "type": "string", "o:length": 32, "o:preferred_column_name": "DATA$email" }, "phone": { "type": "string", "o:length": 8, "o:preferred_column_name": "DATA$phone" }, "address": { "type": "object", "o:length": 1, "o:preferred_column_name": "DATA$address", "properties": { "city": { "type": "string", "o:length": 16, "o:preferred_column_name": "DATA$city" }, "country": { "type": "string", "o:length": 8, "o:preferred_column_name": "DATA$country" } } } } }, "additional_requests": { "type": "array", "o:length": 1, "o:preferred_column_name": "DATA$additional_requests", "items": { "properties": { "type": { "type": "string", "o:length": 16, "o:preferred_column_name": "DATA$type" }, "details": { "type": "string", "o:length": 64, "o:preferred_column_name": "DATA$details" }, "quantity": { "type": "number", "o:length": 1, "o:preferred_column_name": "DATA$quantity" } } } } } } >>>

Enter fullscreen mode Exit fullscreen mode

We can use this Data Guide for creating a table to make it very easy to run SQL over it, but this will be in a different chapter 😉

原文链接:Python Developer Track for Oracle JSON and Duality Views – Part 5: Indexing

© 版权声明
THE END
喜欢就支持一下吧
点赞14 分享
Life, there will always be the unexpected warmth and the endless hope.
人生,总会有不期而遇的温暖,和生生不息的希望
评论 抢沙发

请登录后发表评论

    暂无评论内容