Guide for Experimenters

1 Experiment Management

1.3 Best practices

Creating the FEDSpec

To consolidate the previous section, we encourage experimenters to follow the best practices listed below:

  • Please do not use the provided FEDSpec template as it is; it is a generic template. Remove every #*# placeholder together with the properties that you do not use. For the properties you do use, replace #*# with the actual values. The data format of the values you provide should be the same as in the examples. Use meaningful values: please do not put Test, Blah Blah, etc. in the description and other fields.
  • Specify the userID in the FEDSpec by replacing #USERID#. Your FEDSpec will not be saved if #USERID# does not match the ACCESSTOKEN used.
  • fed:startTime should be earlier than fed:stopTime. Ideally, the difference between startTime and stopTime should not be 1 second, because this allows the experiment to be executed only once in its lifetime. If a single execution is what you need, this setting is fine. There is, however, another option: instead of scheduling the experiment on the Execution Engine, experimenters can use the Polling option available via the Management Console. To use the Polling service, experimenters first have to deploy the FEDSpec but must not “START” the execution of the particular FISMO; by not starting the execution, the FISMO status remains “NOT YET SCHEDULED”. If you need more than one execution, be realistic about the difference. Do not set #PERIODICITY# to 1 second for the queries: this only overloads your system and blocks the network without providing any insight. As a best practice, we recommend a periodicity of not less than 600.
  • If you use dynamic query parameters, use them as specified for the query, i.e., %%fromDateTime%%, %%toDateTime%%, %%geoLatitude%%, and %%geoLongitude%%. For any other parameter used in a dynamic query, use %%DA_NAME%%.
  • Before writing queries, we strongly encourage you to read the documentation on what kind of data is currently provided and what its semantic structure is. Write queries that serve your purpose and are efficient. Please DO NOT:
    • Write queries that are irrelevant to your experiment. They will overload your experiment.
    • Use overly complex queries for simple cases unless there is no other solution.

We would be happy to help you write efficient queries. However, we expect that you, as an experimenter, write the queries first and that we then validate them. Additionally, do not use the “Select * Where {?s ?p ?o.}” query: it goes against our principles and is strictly prohibited. More details on the best practices for efficient querying are provided in the next section.
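The timing rules listed above (start before stop, a realistic interval, a periodicity of not less than 600) can be checked before a FEDSpec is submitted. The following is only a minimal sketch: the class FedSpecTimingCheck and its method are hypothetical and not part of any FIESTA-IoT library, the periodicity is assumed to be expressed in seconds, and the rule "an interval shorter than one period yields at most one execution" is our assumption about how the scheduler behaves.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of FIESTA-IoT.
final class FedSpecTimingCheck {
    // Minimum periodicity recommended in this guide (unit assumed to be seconds).
    static final long MIN_PERIODICITY_SECONDS = 600;

    // Returns a list of human-readable problems; an empty list means no issue was found.
    static List<String> check(Instant startTime, Instant stopTime, long periodicitySeconds) {
        List<String> problems = new ArrayList<>();
        if (!startTime.isBefore(stopTime)) {
            problems.add("fed:startTime must be earlier than fed:stopTime");
        } else if (Duration.between(startTime, stopTime).getSeconds() < periodicitySeconds) {
            // Assumption: a window shorter than one period allows at most one execution.
            problems.add("interval shorter than one period: at most one execution");
        }
        if (periodicitySeconds < MIN_PERIODICITY_SECONDS) {
            problems.add("periodicity below " + MIN_PERIODICITY_SECONDS + " seconds");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Instant.parse expects ISO-8601 timestamps in UTC (trailing 'Z').
        Instant start = Instant.parse("2017-08-01T10:00:00Z");
        Instant stop = Instant.parse("2017-08-01T12:00:00Z");
        System.out.println(check(start, stop, 600));
    }
}
```

Using Instant keeps both times in UTC, which matches the timezone used by the Execution Engine.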

  • Note that the Execution Engine uses the UTC timezone. All your queries should therefore use UTC, and fed:startTime and fed:stopTime should also be specified in UTC.
  • Synchronize #PERIODICITY# with the time interval used in the query. If they are not synchronized, you may receive redundant data that is not useful. Generating redundant data again blocks resources and is inefficient.
  • Please test the #URLLOCATION# before providing it, and make sure that the URL is operational and accepts incoming data. To help experimenters, sample code is provided that opens a URL, accepts incoming data, and stores it in a file. Note that, because the Experiment Execution Engine (EEE) operates in ASYNC mode, the result is sent to the experimenter when it becomes available. Experimenters should therefore periodically check whether the data has arrived on their server. A very simple way to avoid writing a periodic checker is to fetch the most recently created file. All the files that are sent are time-stamped. The naming convention for these files is:

String filename = JOBID.replace("-", "") + URLLOCATION.replace(":", "").replace("/", "_") + "_" + LONG_TIMESTAMP;

Note that JOBID is a UUID set by the EEE, URLLOCATION is the location that you provide, and LONG_TIMESTAMP is the time in milliseconds since the epoch.
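Because the timestamp is the last component of the name, the most recent result can be found from the file names alone. The sketch below illustrates this under a few assumptions: the class ResultFiles and its methods are hypothetical, the host example.org is a placeholder, and the stored file is assumed to carry no extension after the timestamp.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration; not part of FIESTA-IoT.
final class ResultFiles {
    // Builds a name following the convention described above.
    static String fileName(String jobId, String urlLocation, long timestampMillis) {
        return jobId.replace("-", "")
                + urlLocation.replace(":", "").replace("/", "_")
                + "_" + timestampMillis;
    }

    // Extracts the trailing millisecond timestamp (assumes nothing follows it).
    static long timestampOf(String fileName) {
        return Long.parseLong(fileName.substring(fileName.lastIndexOf('_') + 1));
    }

    // Picks the name with the largest timestamp, if the list is not empty.
    static Optional<String> mostRecent(List<String> fileNames) {
        return fileNames.stream().max(Comparator.comparingLong(ResultFiles::timestampOf));
    }

    public static void main(String[] args) {
        String job = "123e4567-e89b-12d3-a456-426614174000";       // example UUID
        String url = "http://example.org:8080/ExperimentServer/store"; // placeholder host
        List<String> names = Arrays.asList(
                fileName(job, url, 1501581600000L),
                fileName(job, url, 1501582200000L));
        System.out.println(mostRecent(names).orElse("no results yet"));
    }
}
```

In practice the list of names would come from the directory in which your receiver stores the incoming files.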

  • If you are executing the experiment using a FEDSpec, you need to install the Experiment Data receiver module on your server. This module provides a valid #URLLOCATION# that can be used in the FEDSpec. The Experiment Data receiver installation guide is available at the following link

Please note that the module is currently tested on Tomcat and creates a #URLLOCATION# that looks like http(s)://HOST:PORT/ExperimentServer/store. If you are using HTTPS, please make sure that you use a Let's Encrypt certificate, a certificate already trusted by the JVM, or the Terena SSL root certificate. If any other certificate is used, the Experiment Execution Engine will not be able to send the result set to #URLLOCATION#. Self-signed certificates will not work. If you use HTTP, there is no such restriction. Priority is given to the URL that is specified. If this URL is not reachable for some reason, the results are stored locally within the FIESTA-IoT repositories so that experimenters can download them later. Note that experimenters have to write their own Java code to download the results. The API that provides this functionality is described later.

If you decide to call the IoT-Registry APIs directly, you do not need to install this component. You will, however, have to authenticate yourself every time you call the IoT-Registry APIs and schedule your experiment yourself, keeping the periodicity in sync with the time interval used in the query (you still need to follow point 3 above).
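When you schedule the calls yourself, one way to keep the periodicity and the query interval in sync is to let every execution cover exactly one period ending at the execution time. The sketch below is only an illustration: the class QueryInterval is hypothetical, the ?time variable and the FILTER fragment are made-up examples, and formatting the values as ISO-8601 UTC timestamps is our assumption about what your query expects.

```java
import java.time.Instant;

// Hypothetical helper for illustration; not part of FIESTA-IoT.
final class QueryInterval {
    // Half-open interval [from, to) that covers one period ending at the execution time.
    static Instant[] forExecution(Instant executionTime, long periodicitySeconds) {
        if (periodicitySeconds <= 0) {
            throw new IllegalArgumentException("periodicity must be positive");
        }
        return new Instant[] {executionTime.minusSeconds(periodicitySeconds), executionTime};
    }

    // Substitutes the two date placeholders with ISO-8601 UTC timestamps.
    static String fill(String template, Instant from, Instant to) {
        return template
                .replace("%%fromDateTime%%", from.toString())
                .replace("%%toDateTime%%", to.toString());
    }

    public static void main(String[] args) {
        // Illustrative fragment only; adapt it to the structure of your own query.
        String template = "FILTER (?time >= \"%%fromDateTime%%\"^^xsd:dateTime"
                + " && ?time < \"%%toDateTime%%\"^^xsd:dateTime)";
        Instant[] window = forExecution(Instant.parse("2017-08-01T10:10:00Z"), 600);
        System.out.println(fill(template, window[0], window[1]));
    }
}
```

Because the interval is half-open, the end of one execution's interval is the start of the next, so consecutive executions neither overlap nor leave gaps.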

  • In the fed:presentationAttr value of the widget attribute, please follow the guidelines given in the section above. The value should be a JSON string of the form:

{
 "Method": ["Method 1", "Method 2", "Method 3"],
 "Parameters": ["Parameters 1", "Parameters 2", "Parameters 3"]
}

  • Please send us your FEDSpec and the queries (for advanced experimenters) that you intend to execute, and have them validated before you register the FEDSpec in the system or use the queries to call the IoT-Registry API directly. If you want to test your queries before providing them (in the FEDSpec, for scheduling, or to us for validation), you can validate them yourself. Note that here we refer only to getting a feel for what data is available and checking that the syntax of the query is correct. Based on our recommendations about your experiment, you need to decide whether creating a FEDSpec or using the IoT-Registry directly is the best option for you.

Two aspects were mentioned above: validating the syntax and getting a feel for the data. You can validate the syntax of the query using http://www.sparql.org/query-validator.html. This ensures that your query is syntactically valid before you run it on the FIESTA-IoT Platform. To get a feel for what the query will return, you can use any REST client (Postman, Google Chrome’s Advanced REST Client, Curl, etc.) to send your request to the FIESTA-IoT Platform. There are 3 main APIs that you need to deal with:

The Authenticate API gives you an access token that you have to use to call the other 2 APIs. Using these APIs you can execute and test your queries. Note that the API to use depends on which graph you want to query.



Writing efficient Queries

We list below the best practices for querying the system. We also ask you to consult the following tables to learn what kind of data is currently present in the FIESTA-IoT repository:

SmartSantander

SmartICS

KETIs Mobius

SoundCity

○      If you first run a resource discovery, you can use the sensor information it returns to save considerable time in later queries (e.g. observations based on node IDs).

○      If your experiment aims at short-term data (not historical values), you can also save time by using IoT-Service endpoints instead of raw SPARQL queries. Please note, however, that this concept is not available for all sensors; it can be used wherever it is provided. In the ontology it is represented via iot-lite:Service.

○      If you do not need the complete graph structure, please do not query for all the concepts and properties.

○      The entire structure is divided into 2 graphs: the resource graph and the observation graph. Direct each query to one of these graphs according to your requirements; querying a single graph is resource efficient. If both graphs must be queried, you must use another graph called “global”. We recommend that you do not query this graph unless it is essential.

○      Learn and understand the meaning of the FILTER, GROUP BY, LIMIT, BIND etc. options, and use them where they are appropriate. Note, however, that adding such keywords can slow down query execution; BIND, for example, significantly degrades the execution time.

○      Try, to the extent possible, to avoid extracting large amounts of data at once (e.g. >5MB). Instead, split your query into several smaller ones, for instance by sweeping the time range in small windows.

○      The dul:dataValue returned may be NaN. Currently, NaN is mainly reported by the SmartICS testbed. If you do not want to receive observations that have a NaN value, it is useful to filter them out. In FIESTA-IoT, some observations currently have dul:dataValue as NaN while others have dul:dataValue as NaN^^xsd:string. Note the absence of a datatype in the first case.

            A filter command that excludes both forms looks like

            FILTER (?dataValue != "NaN"^^xsd:string && ?dataValue != "NaN")

The dul:dataValue can currently be returned with the following datatypes: xsd:int, xsd:double, xsd:dateTime, xsd:boolean, and xsd:string. It is therefore of utmost importance that experimenters check what kind of data they need and understand the mappings between QuantityKinds, Units and datatypes.

○      Each sensor/resource has EXACTLY one QuantityKind and one Unit associated with it. Please refer to the testbed documentation (link below) to understand which phenomenon is mapped to which m3-lite concept.

■      Some other relevant documents are available at:

○      IoT-Service endpoints are a good option for getting the last observations made by the sensors. However, a number of points have to be considered beforehand:

■      FIESTA-IoT does not specify the format of the response messages (in the current version of the platform). This means that every testbed might use a different data format, so experimenters have to parse the responses manually.

■      It is worth highlighting again that these services are not mandatory for testbeds, so they may or may not include these endpoints as part of the resource description. As of today (1st Aug 2017), 2 out of 4 testbeds in the federation (i.e. SmartSantander and SmartICS) offer this possibility.

■      Even though there is a kind of de-facto agreement on the actual use of the endpoints, that is, to expose the last measurement observed by a node, this is not an official standard. Consequently, testbed providers might use the endpoints for a different purpose.

FIESTA-IoT testbeds are deployed in different geographical areas, i.e. Spain (SmartSantander), UK (SmartICS) and South Korea (KETI); the SoundCity testbed is actually a crowdsensing platform and is not bound to a particular physical location. Furthermore, more than 6 testbeds will gradually join the federation. This means that their data will come from different timezones and in different formats.
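Two of the points above, sweeping the time range in small windows and working in UTC regardless of where a testbed is located, can be combined in a small helper. This is a minimal sketch under stated assumptions: the class TimeWindows and its methods are hypothetical, and the zone IDs used in the example are standard IANA names chosen by us for the testbed countries, not values provided by FIESTA-IoT.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of FIESTA-IoT.
final class TimeWindows {
    // Converts a local wall-clock time in the given zone to a UTC instant.
    static Instant toUtc(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).toInstant();
    }

    // Splits [from, to) into consecutive windows of at most 'step' length.
    static List<Instant[]> split(Instant from, Instant to, Duration step) {
        if (step.isZero() || step.isNegative()) {
            throw new IllegalArgumentException("step must be positive");
        }
        List<Instant[]> windows = new ArrayList<>();
        Instant cursor = from;
        while (cursor.isBefore(to)) {
            Instant end = cursor.plus(step);
            if (end.isAfter(to)) {
                end = to; // the last window may be shorter than the others
            }
            windows.add(new Instant[] {cursor, end});
            cursor = end;
        }
        return windows;
    }

    public static void main(String[] args) {
        // Noon local time in Spain on 1st Aug 2017, expressed in UTC.
        Instant from = toUtc(LocalDateTime.of(2017, 8, 1, 12, 0), ZoneId.of("Europe/Madrid"));
        Instant to = from.plus(Duration.ofHours(1));
        for (Instant[] w : split(from, to, Duration.ofMinutes(25))) {
            System.out.println(w[0] + " .. " + w[1]);
        }
    }
}
```

Each window can then be used as the time interval of one smaller query, which keeps the size of every individual result set down.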