HM-10 (and HM-11) BLE – the good, the clones, and the down-right fake!

The HM-10 and HM-11 modules, originated by Jinan Huamao Technology look quite useful for tinkering about with Bluetooth, and their documentation is better than average and there is supplementary information on the web (e.g. Martyn Currey’s blog). The scourge of clones for this kind of module is not news to me but it turned out to be harder than I expected to get a satisfactory product from ebay or Aliexpress.

I would ideally like a genuine product, and while there are sellers proclaiming “genuine” in their listings, this often comes with an excessive price. Other than that, price seems to be a poor guide to quality. I consider a “clone” to be something which is functionally correct but not original and maybe with lower quality hardware. A module with the wrong firmware listed as “HM-10” is a fake, and if the core bluetooth chip (for these modules it is the Texas Instruments CC2541) is not correct it is a down-right fake. The last case is useless as it won’t be possible to load the HM-10 (or HM-11) firmware, which is available for download. Re-flashing is not particularly difficult, but generally involves some “hacky” soldering. Partially to alleviate the hassle of re-flashing but also to give me access to more of the CC2541’s pins (the HM-10 has lots of GPIO, ADCs, DS18B20, DHT11, and a poor-mans PWM), I designed some breakout boards and got a batch made. The Eagle CAD files are on GitHub, along with some links to software/instructions on reflashing.

It’s not that I wasn’t aware of the potential for fakery; I tried to assess listings by checking for ambiguous or garbled wording (almost universal!), clearly inaccurate descriptions, and checked the photos to see “CC2541” on the chip. I thought the most likely outcome was getting a module with an out-dated version of the hardware. I was wrong!

HM-10 Fail #1

On May 6th 2023, I bought two “HM-10” modules on breakouts from ebay seller Alimodule. The total price was £8.02, by no means the cheapest on sale. The breakout has a 5V regulator and [probably] some level shifting to allow it to be used on a 5V Arduino. I really wanted 3.3V supply and logic to work with an ESP8266, but I reckoned these would be OK to experiment with, and that I could butcher the breakout to achieve this. The downside of these boards is that they only expose the UART + status and key pins. The upside is that they are widely available.

After some stumbling about trying to follow the Huamao manual (and failing) it became clear that the firmware was wrong. THe device appears as “BT05” (which I later found out to imply some Bolutek firmware, although I never found documentation for it) and the result of AT+HELP shows the commands are wrong for HM-10.

I tried multiple times to flash new firmware, using both ESP8266 CCLoader and Raspberry Pi CC2541 programmer (see the GitHub link above) and got nowhere. The firmware appeared to upload correctly but the module appeared to be bricked. After SOME TIME, I realised that the chip id which is reported by the RPi programmer showed the chip was a CC2540, a TOTAL FAKE, even though the chip is clearly marked CC2541. Although these two chips are compatible at the source code level (I believe), they are not compatible at the level of compiled firmware. I tried REALLY HARD to find some firmware to run on the CC2540 but failed, so while the product which arrived would have worked as a simple serial-over-ble device, I now have junk!

I got a full refund, but no compensation for the lost time in working out I had a deliberate fake.

Avoid Alimodule!

HM-10 Fail #2

This time and for all the attempts below, I got “stamp” SMD modules to suit my shiney new breakout PCBs.

Ebay seller h-quality_electronic had a somewhat garbled description but a clear title – “HM-10 4.0 Bluetooth UART Transceiver Module Serial Port CC2541” – and a picture which looked OK; although it was missing one of the two SMD crystals (a common sign of a cost-reduced knock-off), the chip was right. I ordered 2 pieces for £6.80 delivered.

Unfortunately, it was the same story as before. The firmware claimed to be BT05 (and this was marked on the board) but many of the commands shown by executing AT+HELP simply returned “ERROR”. The chip id was read to disclose that this was another total fake with a CC2540 marked as CC2541. Seriously, someone went to the trouble of faking the markings on chips on a module only retailing for a few pounds!

This one ended with ebay customer support issuing me with a refund. Avoid this seller.

HM-10 Fail #3

Ebay seller cayin35 had a credible listing; it only mentioned CC2541, gave specific firmware versions, and the picture looked right. With only a few inconsistencies in the description I went for it: 2 items delivered for £6.82.

Sadly, the same story: crappy BT05 firmware with commands that don’t work and a chip which turned out to be a CC2540 when interrogated by a programmer. More TOTAL FAKERY.

A full refund again and another seller to avoid!

Now I am getting a bit pissed off. It really shouldn’t be this hard to get an item as it is described. This isn’t a matter of minor detail; these devices are NOT CC2541 and NOT HM-10.

HM-10 Success (partial)

Four boards from Aliexpress seller “Advanced Tech” cost me £12.91 including tax and postage. The description seemed mostly accurate for a HM-10, with the exception of chip ambiguity at the bottom of the description: “HM-10 CC2541 CC2540 4.0 Bluetooth UART Transceiver Module Transparent Serial Port”. I believe very early HM-10s did use the CC2540, which might explain how this comes about, but still. The picture did show a CC2541 and two SMD crystals (which is generally a sign of not cutting corners), so I gave it a shot.

These arrived today, August 10th 2023. It really has taken since May to cycle through order – wait – receive – test …

The hardware appears OK and the items received do look like the listing picture. The firmware advertises itself as HMSoft via a BLE scanner but once I started to interact with the device using AT commands it quickly became clear this is crap: AT+HELP? lists commands which dont all work (e.g. AT+VERSION returned nothing). AT+VERS? (which is the command used in a real HM-10) DID return a version string: “HMSoftV004”. Obviously nonsense. Firmware is a FAIL!

The Bluetooth chip is, however, a CC2541 (or at least it reports a chip id which matches, when queried by a programmer). And yes, miracle… I managed to flash it with a v540 firmware using the Raspberry Pi CC2541-programmer software. This isn’t the latest firmware but it is what I had in “.bin” form, and it supports updating over the serial interface using AT+SBLUP.

I’d have preferred not to have the hassle of a re-flash, but I’ll consider trying this seller in future as the evidence so far is that trying a new seller is more likely to waste my time and give me more junk than not.

HM-11 Fail

My first attempt to get some HM-11 was from ebay seller satisfyelectronics. Two items for £4.05 including tax and postage should maybe have alerted me to fakery but the title lacked the usual ambiguity about chip: “Bluetooth 4.0 module BLE CC2541 low power NEW HM-11 S“.

When the modules arrived it was immediately clear these were not HM-11 devices. The chip is not a Texas Instruments chip (CC254x)! By comparison with the Huamao datasheet and my HM-11 breakout boards it is obvious thath the module solder pads are all wrong. I should have spotted that in the photo. I can’t even test these and don’t know what they really are.

What egregious mis-description! I got a full refund.

Avoid satisfyelectronics!

HM-11 Success!

Aliexpress seller “fYD Open Source Hardware“. I bought 4 for £12.98 including tax and postage. The listing did give me some cause for concern as the title included “CC2540” (wrong!) and “CC2541”. The picture did show a CC2541 and two SMD crystals (which is generally a sign of not cutting corners), so I gave it a shot.

These were either genuine or good fakes. Hooray! The firmware was not the latest, being v6xx, but that is new enough to allow for firmware updating via the serial port using “AT+SBLUP” and the Huamao upload tool. The listing actually stated firmware v508.

I suppose there is no guarantee that they still have the same batch, and some of their other listings had garbled descriptions, but I would try them again.

Final Words

It is pretty clear that pictures and descriptions are not a good guide. If you find yourself in the same situation, make sure you check the chip id. If it is wrong demand a refund and avoid that seller.

Using Z-Stack ZNP to Make DIY Zigbee Devices to Work With Zigbee2MQTT – Part 2

At the end of part 1, I had got as far as joining my ZNP board to the Zigbee2MQTT (Z2M) coordinator PAN, and seeing the series of attempts by Z2M to interview my device. This is the point at which Z-Tool is evidently limited; it is not really feasible to respond to Z2M messages in a timely manner. To be fair, Z-Tool does have a scripting facility using a Javascript engine, but I didn’t find documentation on the ZNP API to match, so… I used Python and worked at the level of UART messages. Working at this level will make transferring what is learned into a MCU-based device, while using Python reduces the friction of a compile-upload-test cycle while learning. The Python code is available on github.  This not meant to be a library/package and was substantially written as a learning activity. There are published Python packages, zigpy and zigpy-znp, but I felt that: a) using a full-featured package was likely make it harder to understand how things work; b) the comment “Zigpy is tightly integrated with Home Assistant’s ZHA component” suggested I would run into issues and complexity. Additionally, the Python code is just to demonstrate what the interactions should be for a MCU-based system, for which I will need to be working at a lower level than a Python object model. That said, there are some potentially-useful modules in zigpy-znp, which are not tightly integrated to ZHA and would be worth considering if using a Raspberry Pi etc as the basis for a Zigbee ZNP-based device.

Staying Sane and in Control

By the very nature of experimentation, there will be changes in intent and design, errors, etc. Staying in control is helped by removing previously-joined devices from Z2M, making sure the device is set up from a restart, and then re-joining. This isn’t always necessary but…

There is also ample opportunity to get inconsistent set-up in Z2M. If the cluster specification in AF_REGISTER changes in an incompatible way (e.g. the change is not simply an addition), the information which Z2M retains, even when removing and re-joining can mess things up. In this case, stop Z2M, go to the “data” folder of the Z2M installation and remove coordinator_backup.json and database.db, and restart Z2M. This will completely mess up any existing Zigbee network.

Setting up a sand-box coordinator + Z2M installation is probably a good idea to avoid messing up an existing network but also because having lots of devices and even one router is going to make Wireshark captures get very busy indeed. Use a different Zigbee channel for the sandbox; as well as avoiding contention between the radios, this means you can assign the channel Wireshark to only sniff the sandbox messages.

I’ve also found it is generally a good plan to keep Z2M with joining disabled until you want to join a device, after having started Wireshark and generally getting organised.

Z2M Requests Attribute Values – the “Interview”

Stepping back from the Z2M interview…

I am interested in three kinds of interaction between the ZNP board and coordinator/Z2M: Z2M requests attribute values, Z2M requests an action, and the ZNP board sends reports. All of these are in the context of Zigbee Clusters and use Zigbee Cluster Language (ZCL) messages embedded within the ZNP commands sent over the serial interface to the ZNP board. It is important to note that both the ZNP messages and the ZCL messages embedded within them (as the “data” of the ZNP message) have parts with the same (or similar) names and functions: a sequence identifier, a command identifier, and a body/payload/data. The use of a Zigbee sniffer, such as ZBOSS + Wireshark, really helps with interpreting this nested structure and debugging (e.g. to spot malformed ZCL).

Referennce [R6] (see part 1) describes the structure of a ZCL message frame (section 2.4, “Command Frame Formats”), lists the permitted ZCL command ids, and specifies the content of the ZCL message payload for each type of command (section 2.5). The commands relevant to my “three kinds of interaction” are: 0x00 (read attributes), 0x01 (read attributes response), and 0x0a (report attributes). These are all “global” commands in the sense that these commands ids have the same meaning for all clusters. The alternative kind of command is “specific”, meaning that the command id has a meaning which is specified separately for each cluster. Whether the command id is interpreted as global or specific is indicated in the ZCL frame control field (FCF) via bits 0 and 1 of the FCF (see [R6]). The way Zigbee is designed, things like turning (e.g.) a lamp on and off are achieved by sending cluster-specific commands, rather than by writing attributes, whereas the current state of the lamp is determined by reading attributes. This is why I said “Z2M requests an action” in my list of kinds of interaction, above.

Returning to the interview, which is a set of messages requesting the values of attributes in the Basic Cluster.

ZNP passes requests to read attributes to its serial interface as an AF_INCOMING_MSG (see [R2]). The message contains information about the endpoint, source network id (this will be 0x0000, the coordinator), cluster identifier, etc and a ZCL message as its “data” component. For the interview, the cluster id will be 0x0000 (Basic Cluster) and the ZCL command id will be 0x00 (read attributes). Consulting section 2.5.1 in [R6], we can see that the ZCL message comprises a standard ZCL header + a list of one or more two-byte attribute identifiers (Z2M will sometimes issue requests for one attribute per incoming message, and sometimes request several attributes at the same time). The meaning of the attribute identifiers for the Basic Cluster is given in section 3.2.2.2 of [R6].

Care must be taken when handling the cluster id and attribute ids (and any multi-byte component) because little-endian byte order is used. Wireshark is your friend in checking for byte-order bugs.

The ZNP which should be sent in response to an AF_INCOMING_MSQ is AF_DATA_REQUEST (yes, it is called “REQUEST”), which will in turn give rise to ZNP sending both AF_DATA_REQUEST_RSP and AF_DATA_CONFIRM messages. A sequence identifier in AF_DATA_REQUEST matches that in AF_INCOMING_MSQ. My approach is to always wait for a “RSP” before proceeding and to log AF_DATA_CONFIRM as they arrived (this is OK for the Z2M interview, but maybe not always). Note that these messages do not necessarily mean that the attribute values you sent back have actually been handled by Z2M.

Here is a fragment of the annotated log which my Python script emits for the interview, comprising the interactions for a single attribute in the Basic Cluster:

RX body: Cmd: 4481 Body: 00 00 00 00 00 00 01 01 00 31 00 3b 3d 01 00 00 07 10 10 00 05 00 04 00 82 3d 1d
Incoming ZCL for endpoint 1, cluster id = 0000 is: 10 10 00 05 00 04 00
ZCL Command = read attributes (0x00)
Sending response...
TX: fe 24 24 01 00 00 01 01 00 00 00 00 10 1a 18 10 01 05 00 00 42 08 5a 4e 50 2d 54 65 73 74 04 00 00 42 05 41 52 43 31 32 02
[Response] RX body: Cmd: 6401 Body: 00
AF_DATA_REQUEST_RSP success? True
--------------
RX body: Cmd: 4480 Body: 00 01 00
AF_DATA_CONFIRM for TransId=0 on Endpoint=1 had Status=0

The request is a composite requst for the values of two attributes: Manufacturer Name (0x0004) and Model Identifier (0x0005). The response (notice the ZCL command is now 0x01 = read attributes response, and that the ZCL FCF shows the client-server direction is reversed) contains strings for the attribute values. Strings are data type 0x42 (table 2-11 in [R6]) and the “value” given in the data has the length of the string as the first byte, followed by the character codes. Section 2.5.2 of [R6] shows how each attribute requested gets a chunk in the respons comprising: 2 bytes of attribute id + 1 byte status (=0x00) + 1 byte data type (=0x42) + 1 byte string length + n bytes of string characters.

The Wireshark log is helpful to understand and verify what went on. The messages are built up in multiple (nested) layers. The ZCL forms the inner-most, with the Zigbee Application Support Layer above. These are the two most relevant for understanding the ZNP interactions; the other layers are concerned with the underlying network interactions.

The “Read Attributes” and “Read Attributes Response” to match the Python log above are both marked as having protocol = “Zigbee HA” – Wireshark has detected we’re using the Home Automation profile. The Wireshark fragment for the request is:

ZigBee Application Support Layer Data, Dst Endpt: 1, Src Endpt: 1
  Frame Control Field: Data (0x00)
  Destination Endpoint: 1
  Cluster: Basic (0x0000)
  Profile: Home Automation (0x0104)
  Source Endpoint: 1
  Counter: 145
ZigBee Cluster Library Frame, Command: Read Attributes, Seq: 16
  Frame Control Field: Profile-wide (0x10)
  Sequence Number: 16
  Command: Read Attributes (0x00)
  Attribute: Model Identifier (0x0005)
  Attribute: Manufacturer Name (0x0004)

And for the response:

ZigBee Application Support Layer Data, Dst Endpt: 1, Src Endpt: 1
  Frame Control Field: Data (0x00)
  Destination Endpoint: 1
  Cluster: Basic (0x0000)
  Profile: Home Automation (0x0104)
  Source Endpoint: 1
  Counter: 3
ZigBee Cluster Library Frame, Command: Read Attributes Response, Seq: 16
  Frame Control Field: Profile-wide (0x18)
  Sequence Number: 16
  Command: Read Attributes Response (0x01)
  Status Record, String: ZNP-Test
  Status Record, String: ARC12

Digging into the component parts of the ZCL in Wireshark, and seeing how elements relate to the bytes-level view should clarify the comment above: “each attribute requested gets a chunk in the respons comprising: 2 bytes of attribute id + 1 byte status (=0x00) + 1 byte data type (=0x42) + 1 byte string length + n bytes of string characters”.

Beyond the Interview – a Switch and a LED – the Generic OnOff Cluster

Defining two endpoints on the ZNP board, one for a switch and one for a LED, is sufficient to explore the “three kinds of interaction” through: the Z2M web dashboard, Wireshark, and serial interface (whether via Z-Tool, Python, etc). NB: I use “switch” to mean something like a push button or toggle switch, rather than as a synonym for a relay (the practice of using it as a synonym for relay should be strenuously avoided as it makes the word “switch” become ambiguous).

The AF_REGISTER command (see part 1) is used twice, once for each endpoint. Both the LED and switch use cluster 0x0006 (generic On/Off). For the switch, I added cluster 0x0006 as both an input and output to endpoint 1. The input allows the switch to be remote-controlled from Z2M whereas the output is used to inform the coordinator/Z2M of a change in switch state. State changes are sent as ZCL reports (ZCL command id = 0x0a) whenever the switch changes state. The same kind of ZCL message can also be used for periodic reporting, e.g. for a sensor. I put the LED on endpoint 2. In this case, the appearance of 0x0006 in the in-cluster list is what allows the LED to be switched on and off (analogous to remote controlling the switch) and in the out-cluster list it means the LED state is reportable (in my Python I use a periodic report for this, for demonstration purposes, but an event-driven report would be more sensible in practice).

After issuing ZDO_STARTUP_FROM_APP, that is all that is needed to set the ZNP board up. The real work is done handling incoming ZCL and preparing outgoing ZCL. We do, of course, need to have a converter JS file in Z2M too. There is a rather hacky converter in the github repo, which will work but may not be optimal!

I will now outline what happens for each of the remaining “three kinds of interaction” (reading the state of the LED/switch simply being the same pattern as reading the Basic Cluster attributes) in the context of this simple test-device.

Reporting Switch Changes and LED State

These two are identical in the structure of the ZCL, which is very close to that used in the “read attributes response”; the only differences in the ZCL are the command id (0x0a) and the report data structure does not have a status byte. See the frame format for the Report Attributes Command [R6]. The attribute id for on/off state is 0x0000 and for the ZCL we need its data type (boolean, type id = 0x10) and one byte for on (0x01) or off (0x00). Remember that the switch and LED are on different endpoints, but also recall that the endpoint is included in the data sent via an AF_DATA_REQUEST ZNP message. This is the message to use although, again, having “REQUEST” in the name seems wrong for a report; its just the way it is! As for the Read Attributes Response case, expect AF_DATA_REQUEST_RSP and AF_DATA_CONFIRM in response to the report message.

Reference to section 3.8 in [R6] describes other attributes which a generic on/off device may support.

Here is my Python log for reporting when I pressed the button linked to endpoint 1:

TX: fe 11 24 01 00 00 01 01 06 00 00 00 10 07 18 00 0a 00 00 10 01 26
[Response] RX body: Cmd: 6401 Body: 00
AF_DATA_REQUEST_RSP (for report) success? True
--------------
RX body: Cmd: 4480 Body: 00 01 00
AF_DATA_CONFIRM for TransId=0 on Endpoint=1 had Status=0

And the Wireshark capture to match:

ZigBee Application Support Layer Data, Dst Endpt: 1, Src Endpt: 1
  Frame Control Field: Data (0x00)
  Destination Endpoint: 1
  Cluster: On/Off (0x0006)
  Profile: Home Automation (0x0104)
  Source Endpoint: 1
  Counter: 2
ZigBee Cluster Library Frame, Command: Report Attributes, Seq: 0
  Frame Control Field: Profile-wide (0x18)
  Sequence Number: 0
  Command: Report Attributes (0x0a)
  Attribute Field
    Attribute: OnOff (0x0000)
    Data Type: Boolean (0x10)
    On/off Control: On (0x01)

Remote Control

This is about using the Z2M front end or MQTT publish activity to control the ZNP board device.

Turning the LED on and off and remote-controlling the switch are identical as far as the ZCL goes. In our case, the only difference is the endpoint which comes through with the AF_INCOMING_MSG. The ZCL has a different structure to the previous cases seen and the ZCL FCF is what signals the difference. Recall that attribute requests had a FCF of 0x10 and 0x18 for the response and reports. These commands have the same meaning no matter what cluster is involved. The important conceptual point is that the ZCL for turning something on or off is not the same as setting the attribute for state; it is an action which has specific meaning only within the context of the genOnOff cluster (0x0006). The FCF signals the fact that the command is “local or specific to a cluster” (see [R6]) via bit 0, which is unset for “global” (any-cluster) commands and set for local commands; checking the FCF is key to handling these messages correctly. The remainder of the ZCL after the FCF is very simple; first there is the usual sequence number, then there is a single command byte, where 0x01 means “turn on” and 0x00 means “turn off”. Section 3.8.2 of [R6] describes four other commands for use cases such as off-with-fade, on-with-timed-off, etc.

Having received the command, we should reply to Z2M because the FCF says a default repsonse is expected (bit 4, “disable default response” is not set). A default response is simply a ZCL message with a command id of 0x0b and a short payload indicating the command which was sent and a byte for success/fail (see [R6] 2.5.12). This is a global, not a cluster-local, command. Our default response message leads to a AF_DATA_REQUEST_RSP and AF_DATA_CONFIRM, as for any other AF_DATA_REQUEST.

Here is the message log from my Python script for turning the LED on:

RX body: Cmd: 4481 Body: 00 00 06 00 00 00 01 02 00 3c 00 40 16 03 00 00 03 01 29 01 82 3d 1d
Incoming ZCL for endpoint 2, cluster id = 0006 is: 01 29 01
Local/specific command to cluster 0006
=> device on endpoint 2 set to: on
Sending response...
TX: fe 0f 24 01 00 00 01 02 06 00 00 00 10 05 18 29 0b 01 00 01
[Response] RX body: Cmd: 6401 Body: 00
AF_DATA_REQUEST_RSP success? True
--------------
RX body: Cmd: 4480 Body: 00 02 00
AF_DATA_CONFIRM for TransId=0 on Endpoint=2 had Status=0

And here is the Wireshark capture for the same on command:

ZigBee Application Support Layer Data, Dst Endpt: 2, Src Endpt: 1
  Frame Control Field: Data (0x00)
  Destination Endpoint: 2
  Cluster: On/Off (0x0006)
  Profile: Home Automation (0x0104)
  Source Endpoint: 1
  Counter: 151
ZigBee Cluster Library Frame
  Frame Control Field: Cluster-specific (0x01)
  Sequence Number: 41
  Command: On (0x01)

and the default response (just the ZCL part this time):

ZigBee Cluster Library Frame, Command: Default Response, Seq: 41
  Frame Control Field: Profile-wide (0x18)
  Sequence Number: 41
  Command: Default Response (0x0b)
  Response to Command: 0x01
  Status: Success (0x00)

Footnote

While these two posts are a long way short of a comprehensive tutorial, I hope they provide enough structure to make it easier to understand what is going on and to work out the details. I found it very helpful to work carefully through the content of messages both at the UART/ZNP level and at the Zigbee sniffer (Wireshark) level, correlating the bytes with the relevant documentation.

If you find errors, especially where these are misunderstandings of how ZNP and Zigbee work, please do comment.

Using Z-Stack ZNP to Make DIY Zigbee Devices to Work With Zigbee2MQTT

Z-Stack ZNP (Zigbee Network Processor) allows you to make a Zigbee device by running ZNP on the Zigbee chip (CC2530 etc) and a microcontroller to read sensors, actuate motors, display stuff etc, using a basic communication protocol (UART serial or SPI) between the two. This saves you from having to compile code for the Zigbee chip (which requires a costly IDE for the CC2530) and means you can, for example, work with user-friendly libraries such as Arduino (whether for Arduino hardware, ESP8266, etc…).

Unfortunately, the documentation for ZNP is not great. Information is spread over several documents. There are errors. There is a distinct lack of information which “walks you through”. This article is my notes on how I got things to work, with a few caveats: I am working with a Zigbee2MQTT environment running @KoenKK’s Z-Stack HA 1.2 coordinator firmware and am explicitly working only within the scope of the Zigbee 1.2 Home Automation Profile on a CC2530 device. I see no reason why what follows could not be applied to Zigbee 3 or to chips other than CC2530 with a few tweaks.

Getting Started

Download Z-Stack Home 1.2 from the Texas Instruments Z-Stack archive (it should be the one named “Z-STACK-HOMEZigBee Home Automation Solutions“, with an installer file name of “Z-Stack_Home_1.2.2a.exe”. This provides you with the Z-Tool and some documentation. Do not use the firmware included! There is source code, which can sometimes be useful to work out details which the documentation does not explain.

Download the @KoenKK Home 1.2 coordinator firmware for your hardware (CC2530 etc). This is almost the same as the ZNP firmware available in the TI Z-Stack download but has a few modifications and different compiler flags set. These differences are essential! Although this is designated as “coordinator”, it is possible to change the role to end-device using ZNP. This is what we’ll do.

I’m using my E18 boards (see my E18 breakout post), but any basic development board based around the same CC2530 chip should work just the same, so long as it exposes the UART pins (see below). Ebyte sell a dev board with an E18 already mounted and an onboard USB-serial converter (but you will need to use jumper wires to connect the the CC2530 UART I/O to this). I will use “ZNP board” to refer to this hardware.

Flash the ZNP board with the @KoenKK firmware.

I’ll be using Zigbee2MQTT (hereafter “Z2M” and the ZBOSS/Wireshark sniffer as I outline my DIY Zigbee post. The sniffer is very useful for observing the sequence and content of messages and will helpfully show when you try to send malformed messages (these are often not reported by ZNP, which cheerfully responds with “success” messages).

For the purposes of exploration and learning, I use a PC to communicate with the ZNP board, rather than a micro-controller. This makes it much easier to experiment, while knowing that the same UART messages send from a MCU would have the same effect.

The system outline is: [PC – USB/Serial adapter] << UART serial >> [ZNP board] << Zigbee wireless >> [Zigbee coordinator dongle] << serial tty >> [Zigbee2MQTT].

At the PC end, the ZNP commands can be generated and sent “as you please”. I used Z-Tool (see below), some Python code, and (mostly just to show it worked) Termite with the “hexadecimal view” filter, my favourite serial terminal. ZNP commands are just a bunch of bytes!

While the TI ZNP firmware is compiled to require hardware flow control (which Arduino libraries do not have), and to require configuration (SPI vs UART) via a hardware pin, the @KoenKK does not (see the “patch” file in the github repo if you are interested to see the changes). It only supports UART comms and uses P0.2 for RX and P0.3 for TX. Remember to cross-over TX and RX between the USB-serial adapter and the ZNP board. It also exposes four GPIOs via ZNP on P0.6, P0.7, P1.6, and P1.7.

Unfortunately, the UART comms option doesn’t come with any power saving support, whereas the SPI option does (although how effective it is I do not know). The bottom line is that the ZNP board seems to draw a continuous 23mA  approx, which would be hopeless for a battery device. For a sensor/output-oriented device it would be feasible to use a MOSFET to kill the power to the E18 before putting the MCU to sleep, but devices which should be remote-controlled or queried will just have to be mains powered. I don’t suppose this is such a big deal as such devices are likely to have more energy consumption than a battery is likely to be suitable for.

Note: I will use “Termite” to mean “use the serial terminal of your choice” and “Wireshark” to mean “use the Zigbee sniffer of your choice”.

Documents

The main references which I found to be useful, specifically those relevant to HA 1.2, are:

[R1] “Z-Stack ZNP Interface Specification” – outline of ZNP, esp the physical interface (for various SoCs and devkits) and message frame structure.

[R2] “Z-Stack Monitor and Test API” – basic message and response structure for commands shown in Z-tool (except the Simple API). This is the core reference for ZNP message structures.

[R3] “Z-Stack Simple API” – general description of Simple API + index of configuration ids.

[R4] “Developing a ZigBee System Using CC2530 ZNP” – this contains a very useful walkthough of ZNP commands but doesn’t explain much. (Lamentably, TI seem to have removed this from their online document library.)

[R5] “ZigBee Home Automation Application Profile” – contains prescribed profile Id, device Ids, and cluster Ids applicable to HA 1.2. There is also a more recent Zigbee Cluster Library specification [R6] which is applicable to Zigbee 3.0 (and will surely have incompatible elements alongside stuff which works!). HOWEVER: this HA profile does not document the ZCL frame structure or the Basic Cluster attributes; there is presumably a Z1.2 document for these… but [R6] seems good enough. (Lamentably, the Zigbee Alliance, now CSA, no longer make this available. Bad people!)

[R6] “07-5123-08-Zigbee-Cluster-Library” – revision 8 of the Zigbee Cluster Library, aka “ZCL”.

[R1] to [R3] may be found in the Documents folder of the Z-Stack installation.

First Test – Termite and Z-Tool

Having connected P0.2 and P0.3 as noted above, connect with Termite at 115200 baud (at this point it is best not to have the sniffer dongle also attached, so you can easily see which the correct serial port is). Hit the ZNP board reset button and watch for its reset indication message to appear in Termite. This message is called “SYS_RESET_RESPONSE” (but some of the documentation calls it “SYS_RESET_IND”). Note that the message is framed: there is a “start of frame” (SOF) byte 0xfe, followed by the message body, and terminated by a single “FCS” checksum byte. The checksum uses the XOR8 algorithm (look it up). The message body starts with 1 length byte, followed by a 2 byte command and then a variable length of data. Review the message alongside the documentation for the message frame structure and the SYS_RESET_RESPONSE message type. Get some paper and check the checksum! Note that the command id (0x4180) is useful for looking up in the documentation when the “friendly” name has been changed.

The same message structure is the basis for all ZNP messages.

Close Termite’s connection to the serial port and start up Z-tool (it can be found in the Tools folder of the Z-Stack installation). Unfortunately, Z-tool defaults to the wrong baud rate. Look in Tools > Settings and change it to 115200 for the serial port in question. The setting persists for each serial port.

It should scan for and find your device (use Tools > Scan for devices to repeat the scan). It can take a while for the ZNP board to become visible on a cold start, but it should come up quickly once you’ve seen it appear in Termite. Hit the ZNP board reset button again and correlate the Z-Tool presentation with the raw serial message in Termite.

Now experiment with a few simple commands such as SYS_VERSION, SYS_PING, and SYS_RESET. UTIL_GET_DEVICE_INFO is also quite informative. You should be able to see the hardware MAC address (aka IEEEAddr) and the ShortAddress, which is used within the PAN. DeviceType is a little confusing; it shows the roles the firmware is capable of operating as, rather than the role which it currently has. The DeviceState for an unjoined device should be 0 = “DEV_HOLD”. More on device states later.

If feeling adventurous, wire some LEDs to the GPIO pins mentioned above and experiment with UTIL_LED_CONTROL and SYS_GPIO (you will definitely need to read the documentation carefully). Note that some SYS_GPIO settings interfere with UTIL_LED_CONTROL.

Also: try composing raw byte sequences using Termite, remembering the SOF and FCS bytes. You will find that ZNP will not respond if you make a mistake; in general, do not expect to receive an error message but DO expect to receive an explicit response message. The response might be a simple “success” response, some data, or a deferred indicator/callback as for the case of a restart.

None of these ZNP messages have gone beyond the ZNP board; they are only communicating with the ZNP firmware.

Preparing the ZNP Board as an End Device

So far, the ZNP board still thinks it is a coordinator; some configuration settings must be changed.

First, though, we should give some thought to the start-up procedure. Whether we’re thinking of a PC or a MCU communicating with the ZNP board, we need to know when it is ready. Refer to [R1]. In practice this amounts to waiting for SYS_RESET_RESPONSE (command id 0x4180) before attempting to send ZNP messages.

Although this configutation is only needed once, I think it makes sense to execute these commands with a script each time I change anything. For an MCU-based device in use, I would not normally perform these steps unless a jumper or button is active at start/reset time.

The command to use is ZB_WRITE_CONFIGURATION (Simple API) and each command should cause a ZB_WRITE_CONFIGURATION_RSP response from ZNP. Refer to [R2] for the structure of the ZB_WRITE_CONFIGURATION message and response (in the doc, “SREQ” is the request and “SRSP” is the response). The values for ConfigId are hidden in section 5.3 of [R3]. Note that Z-Tool expects the configIds to be provided in decimal form!

The following should be set:

  • ZCD_NV_LOGICAL_TYPE (configid=0x87=135d), value=0x02 for end device.
  • ZCD_NV_PAN_ID (configid=0x83=131d), a value of 0xffff means “don’t care”, which is appropriate for a single PAN environment as we want the device to join any PAN it finds.
  • ZCD_NV_CHANLIST (configid=0x84=132d), value = 4 bytes bitmask of channels to use. The LSB is channel 0, so the bitmask for just channel 11 would be 0x00 00 08 00. This is the compiled default and leaving it as-is does not stop the end device connecting to a coordinator on a different channel. Setting ZCD_NV_CHANLIST to match the coordinator channel will reduce scan-and-connect-to-PAN time because the end device will try channels in ZCD_NV_CHANLIST first, and then try others. I use channel 16 = binary 1 0000 0000 0000 0000 = 0x00 01 00 00. Note that in both Z-Tool and in the raw ZNP message, the value is set in little-endian byte order, so the bytes to send (in order) for channel 16 would be 00, 00, 01, 00 (remembering to use decimal form of each byte in Z-Tool).

I also set the ZCD_NV_POLL_RATE. This modifies the interval with which the device contacts the coordinator with IEEE 802.15.4 “Data Request” packets once joined (see Wireshark sniffer). The compiled default is 1000ms. THe documentation wrongly states this is config_id 0x24 and has 1 byte length. It is actually config_id = 0x35 and is a 4 byte value (in ms) expressed in little-endian form! I use a value of 15s, which matches the PTVO firmware. Expressing in ms and in little-endian order, this requires the series of bytes: 0x98 0x3A 0x00 0x00.

A likely cause of problems when “messing about” is caused by the way ZNP stores network state in non-volatile memory. While this is a good thing for devices in real use, speeding up their re-connection, it is helpful to always have a clean start if messing about… So, for experimental work, set the ZCD_NV_STARTUP_OPTION (configid=3) to a value of 2. This will clear the network state on restart. Issue a ZB_SYSTEM_RESET command to clear state immediately.

At this point the ZNP board is still unconnected from the network. It is not joined to a PAN and you will not see any network traffic to/from it in Wireshark (or other sniffer).

Joining a PAN

This involves two steps: first declaring to ZNP what “clusters” it offers/consumes, and second is telling ZNP to attempt to join a PAN.

There are some concepts and associated names for them which need to be understood before the first step can be completed.

Endpoint is a way of identifying separate features offered by a single physical device (identified by its MAC address and two byte PAN-network address). Consider an endpoint as a “virtual” device which shares a single physical device. For example, a Zigbee device controlling three separate lights would have three endpoints.

ClusterId is a way of identifying the set of attributes which a device feature supports. A simple on/off device has a cluster id of 0x0006. There is a Basic Cluster, id = 0x0000, which all devices should support. The Z2M “interview” involves the coordinator sending requests to read attributes in the Basic Cluster such as the device model and manufacturer. See [R5] and [R6].

InCluster, OutCluster, Server, Client are very confusing terms used in the Zigbee specifications. They often seem to be the wrong way! One way of deciding what they should be is to create a device using PTVO Configurable Firmware builder and then observe the messages using Wireshark. The InCluster list contains “server” features and the OutCluster list contains “client” features. Some examples: a light is a “server”, so is set up as an input cluster; a switch is a “client”, and a temperature sensor a “server”. I see that PTVO devices put the Basic Cluster as both an input and output cluster, but thinks work just fine with it defined as only an input cluster.

The first step, declaring the clusters uses the AF_REGISTER command (use Z-Tool), with the following values:

  • Endpoint = 1 (actually a free choice, but 1 seems logical!)
  • AppProfId must be set to the Home Automation Profile ID, which is 0x0104.
  • AppDeviceId should be chosen from the values given in [R5]. I used 0x0000.
  • AppDevVer can be freely chosen.
  • Set a length of 1 for the InClusters list and add the Basic Cluster Id.

After defining the endpoint, set Z2M to accept join requests and start Wireshark (remember to set the channel to match the Z2M coordinator).

Now for the second step. This will trigger network activity and a series of ZNP callback messages to inform you of the progress of joining. The command to use is ZDO_STARTUP_FROM_APP, setting a delay of 0 (appears as “Mode” in Z-Tool). The ZNP callbacks should comprise one ZDO_STARTUP_FROM_APP_RSP (with a network status code of 1) and several ZDO_STATE_CHANGE_IND messages. The status codes given by ZDO_STATE_CHANGE_IND are not in the documentation but can be found by referring to Components\stack\zdo\ZDApp.h in your Z-Stack 1.2 installation (find the devStates_t enum). The values I see follow the pattern: 2 DEV_NWK_DISC {potentially repeated} -> 3 DEV_NWK_JOINING -> 5 DEV_END_DEVICE_UNAUTH -> 6 DEV_END_DEVICE. Z-Tool incorrectly decodes status = 2 as “INVALID_PARAMETER”. If you see status 2 arriving multiple times it means you did not set the channel list to match the coordinator, and if you only ever see status 2, the coordinator is probably off-line or not accepting join requests. Status 6 means the device has joined the PAN as an end device and it should appear in the Z2M web interface.

You will now see a bunch of AF_INCOMING_MSG messages. These are from the coordinator to the ZNP device and are caused by Z2M’s “interview”, which is trying to read the attributes in the Basic Cluster. It will eventually give up. The ZNP board is now joined into the PAN but is not properly described in Z2M and can do nothing useful. Still: getting this far is an achievement.

Aside: there are also Simple API commands for this process, but I find the responses less informative.

Message Log

Here are the raw ZNP messages sent over UART and the responses (generated by a Python script – see Part 2). The responses have been parsed out to command + body, whereas the TX bytes include the SOF and FCS.

TX: fe 03 26 05 87 01 02 a4
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 04 26 05 83 02 ff ff a6
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 06 26 05 84 04 00 00 01 00 a4
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 03 26 05 03 01 02 20
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 06 26 05 35 04 98 3a 00 00 b6
[Response] RX body: Cmd: 6605 Body: 00

And here is the AF_REGISTER and “start”.

TX: fe 0b 24 00 01 04 01 00 00 01 00 01 00 00 00 2b
[Response] RX body: Cmd: 6400 Body: 00
TX: fe 01 25 40 00 64
[Response] RX body: Cmd: 6540 Body: 01

In Wireshark, you should see an Association Request, followed (but probably not immediately) by an Association Response. Note that the protocol is IEEE 802.15.4, rather than “Zigbee”, as these are network-level interactions. See that the request and response source and destination use a MAC address and that the short address (2 bytes) which will subsequently be used (and is shown in Z2M) is provided in the response.

Further down the Wireshark log, you should see a “Transport Key” message being sent to the short address, now given as “Zigbee” protocol, followed by a “Device Announcement” where the newly-joined device broadcasts its presence.

To be Continued…

Part 2 will look at the request and response messages involved in the Z2M “interview”, and beyond to build a simple device to demonstrate reporting and remote control.

Zigbee from a DIY/Maker Perspective – Low Cost Experiments

My perspective on Zigbee is one of a DIY-er and Maker; I’m primarily interested in making my own devices, rather than doing things with commercial devices. This means, among other things, that I definitely do not want to be reliant on proprietary “hub” software and want to keep control over complexity and cost. What follows are some brief notes on the technology choices which worked for me. The XBee platform looks a bit pricey, especially if the ultimate plan is for multiple DIY devices. This post condenses quite a bit of desk research and experimentation (I didn’t already have a Zigbee network, so started from zero); I hope it provides other people with a “jump start”. I won’t explain basics, so if you don’t already know what “coordinator”, “router”, and “end device” signify in a Zigbee network, look here.

The E18 modules from EByte are available in a number different models and I opted for the quite economical E18 MS1 PCB (this has a PCB antenna, whereas the similarly-priced E18 MS1 IPX has an IPX socket for an external antenna).  Current (July 2023) prices on Aliexpress for E18 MS1 PCB, are: £2.61 or $3.18 + tax. Other options are available with power amplifiers, but for experimenting, the MS1 is just fine and I’ll only pay the extra (and suffer lower battery life) if I find a need. When I say E18 below, I will be particularly referring to the E18 MS1, although the comments may apply to other models.

The basic E18 module is in the form of a surface mount “stamp”, so I made some breakout boards to expose its many GPIOs. I made lots, so am selling on ebay. I’ve written a separate post about these boards, which contains a link to the ebay listing and to a github repository with Eagle CAD files (if you want to get your own boards made).

The big challenge with using the E18 is firmware. The factory firmware is quite specific and limited in what it can achieve, although it takes quite a while to work this out from studying the manual; it is intended for “transparent” data exchange between devices. This is not a good match for the kind of home automation devices which I’m interested in, and which I suppose most DIY/Maker types are. Another issue is that it is impractical for hobbyists to compile source code for the CC2530, the Texas Instruments system-on-chip which is the heart of the E18, because you need the IAR IDE costing several thousand dollars. However: the good news is that someone has produced a configurable firmware generator: PTVO. Another option is to use the Z-Stack “Zigbee Network Processor” (aka ZNP). This also avoids the need to compile for the CC2530 but is nowhere near as easy to use as the PTVO configurable firmware generator, so I will write a separate article on using ZNP.

PTVO Configurable Firmware is extremely helpful in that you set up how the GPIOs should be used as inputs and outputs, including use of multiple I2C sensors and more. It does all this without needing you to use a separate microcontroller; a single E18 (or similar) module interfaced to switches, sensors, indicators, etc is all you need. There are a few things which the free version cannot do, but the “premium” version is not at all expensive. Practical battery-powered end devices will require the premium version for the power saving mode; without this, I have measured PTVO firmware devices consuming over 70mA continuously.

The PTVO tool can also generate the converter JS files needed by Zigbee2MQTT (see below), which is very helpful indeed, since the documentation on writing these youreself is rather sketchy. While these converter files seem to mostly work “out of the box”, the one generated for a basic switch is not compatible with the firmware which is generated. I raised an issue on github but the developer didn’t seem to consider incompatible output to be a bug. One limitation of the PTVO firmware is that you can only use the devices which it supports and in the way it supports them (e.g. you cannot control over-sampling and other noise-reduction techniques). If these are issues then a resort to ZNP and a MCU with your own code to interface with the sensor/display/actuator is the easiest way ahead.

In addition to the E18s, which will be the heart of my Zigbee devices, I also obtained:

  • The Ebyte E18 USB dongle (CC2531 chip), model E18-2G4U04B, for use as a Zigbee message sniffer. The pre-installed firmware works with the Texas Instruments Sniffer software (version 1, not version 2!) but this is not much use as it cannot decrypt encrypted messages. So: I installed the ZBOSS sniffer firmware and use the ZBOSS extension for Wireshark (see setup notes on Zigbee2MQTT site). This works mostly nicely and is quite informative. You DO need to select the right networking channel (the default in Zigbee2MQTT is 11, but I changed to 19 to reduce WiFi interferrence). Note that this hardware is not well suited to running the Zigbee 3 coordinator firmware or supporting many devices, and is quite low power (specifics are documented on the Zigbee2MQTT site).
  • A bundle comprising another CC2531 USB dongle (but this one with an external antenna, and intended to be the network coordinator), a CCDebugger and associated connection cables AND an adapter PCB with 0.1″ and 0.05″ headers. There are plenty of these on ebay, Aliexpress, and Amazon. I installed the Z-Stack 1.2 HA coordinator firmware on this dongle using the CCDebugger and TI Flashing software (this is where the 0.1″ to 0.05″ adapter board and cables come in). I also found that the Ebyte dongle had a single row of 0.05″ spacing holes for programming and was able to slide these over 5 of the 10 pins on the adapter board and then make ad-hoc connections to the programmer. This worked first time, and is much cleaner than solding “Du Pont” cables as suggested on the web.

On the question of programmers: I think the CCDebugger is the easiest way and, when bundled with the adapter and dongle is a “no brainer” purchase if you’re getting set up from scratch. I have also used other options with a Raspberry Pi, Flash-CC2530 (see my other post), and if you conduct a web-search for “CCLoader” + “ESP8266” or “Arduino”, you will easily find ways of using those MCU platforms as programmers. Some of these are for the CC254x devices, but the communication protocol is the same. Use “.hex” (Intel Hex) files with TI SmartRF Flash Programmer and “.bin” files with the other options. If you need to create bin from hex files, try using hex2bin (noting that you may need to use padding pad). The CCDebugger and Flash-CC2530 also show you the chip id, which can be helpful to find out whether there is a CC2530 or CC2531 inside (or not). Chip Id helped discover that multiple BLE modules I bought had mis-labelled chips as well as incorrect firmware: total fakes! Whatever you use for programming, only the DD, DC, and Reset signals are needed (in addition to Gnd and supply). If you use the CCDebugger, note that it provides a 3.3V power supply but also has a Vsens (read the manual!). I tend to program “out of circuit”, using the CCDebugger’s power supply and simply connect Vsens to the debugger’s power supply.

Now to the important question of the coordinator, and the bridge between planet-Zigbee and the rest of the universe. Zigbee2MQTT is freely available and non-proprietary and has some nice features:

  • It provides a bridge to a MQTT broker. This is a smart move for integration of different sense-and-control technologies. In my case, I already have a MQTT broker running as part of my Open Energy Monitor (inside it has a Raspberry Pi3) and I have co-opted the broker to be a hub for several air quality and temp+humidity sensors around my house and garden based around the ESP8266 Wifi MCU. There are numerous PC and Android (and doubtless Apple) Apps which can subscribe and publish to MQTT and I’ve written several Python dashboards using Plotly Dash and the Paho MQTT package. Using MQTT allows you to integrate Zigbee and non-Zigbee “worlds”, although the interactions will not be as rapid as a pure Zigbee system. If you know nothing about MQTT: find out!
  • It can run on a Raspberry Pi v3 or v4 (but not my old v1). I run it in a RPi v3 with its WiFi turned off, and have not had to resort to a USB extension cable for the coordinator dongle (some people recommend such an extension to avoid radio frequency interferrence).
  • It has a nice web interface which can be used to monitor the Zigbee network, inspect device status, control devices, and “bind” devices together.
  • Kantor Koen (the brains behind Zigbee2MQTT) provides firmware for the coordinator.
  • Mostly good documentation, although I could not get it to install on Windows 11 by following the instructions and the documentation on writing converter JS files is not adequate (not that this is a problem when the PTVO JS files work).

The question of radio frequency interferrence is referred to on the Zigbee2MQTT website, forums, and elsewhere but much of the information is unhelpfully inaccurate due to omission of caveats. The core issue is that WiFi and Zigbee share the same frequency band. I have widely seen that the assertion that there are only three non-overlapping WiFi channels: 1, 6, and 11. The central neglected caveat is that in the USA, unlicenced use is limited to channels 1-11. Much of the rest of the world can use channels 12 and 13. So, while 1, 6, and 11 are the sensible choice for optimal separation in the USA, there are actually 4 non-overlapping channels elsewhere: 1, 5, 9, 13 (a separation of 4 channels is quite adqeuate). This gives scope for operating 3 WiFi channels AND having uncontested space for a Zigbee network. NXP also has an application note (pdf) which shows the results of experiments on WiFi/Zigbee interference which suggests that concerns about wide “sideband” noise are not really warranted if you are using “802.11n” WiFi.

Summary

Ebyte E18 USB (CC2531) dongle and ZBOSS for a Zigbee sniffer.

Another CC2531 USB dongle with an antenna running Z-Stack 1.2 HA coordinator firmware attached to a Raspberry Pi v3. This is a bit low powered for serious use, but is fine for experimenting, prototyping, and small-scale use; I’ll upgrade if needed.

E18 modules on my breakout PCB runing PTVO firmware as “terminal” devices.

[optionally also E18 modules with PTVO “router” firmware, although I might end up using yet more USB CC2531 dongles as routers on the grounds that they are easy to power]

Zigbee2MQTT running on a Raspberry Pi v3 (in my case I used the MQTT broker on my Open Energy Monitor, rather than running on the same Pi).

For monitoring and publishing MQTT while experimenting I tend to use MQTT Explorer on my Windows PC. There are several Android apps which can do plotting, dashboards, and control widgets; some are better than others, depending on what you want…

 

 

Sonoff Basic R3 – DIY Switch via HTTP POST of JSON

The latest firmware of the Sonoff BasicR3 Wifi Switch now allows the device to be controlled through the DIY mode built into its firmware (v3.6.0) without any re-flashing with Tasmota firmware or fooling about with jumpers. The only obvious limitation for some people is that the “eWeLink” app and DIY access cannot both be used together; the switch is either in DIY mode or app-control mode. This is not an issue for me as I have no requirement for remote control from outside my LAN and don’t intend to use the app at all. In any case it would not be hard to either use a general purpose HTTP “shortcuts” app (e.g. HTTP Shortcuts!) for control from a phone from my LAN, or to write/run my own web app of a Raspberry Pi. I begin to digress… the point of this post is to make some notes for myself in a way which someone else might find useful (and adding some extra helpful comments). Much of what follows can be found on the web, but there are some holes, and I didn’t find the documentation quick and easy in parts.

Getting into DIY Mode

The key is to get the device into what the Sonoff User Manual calls “Compatible Pairing Mode”. The push button is used in two stages, with long (5s) holds first to get the LED signalling “..-” and then a continuous rapid flashing.

The switch is now acting as a WiFi access point with SSID (name) of “ITEAD-{device_id}”, where device_id will be something like “100136da83”. Connect to this network and then visit http://10.10.7.1 to access a web page which allows you to enter your own WiFi SSID and password. After a short while, the Sonoff device will have closed its access point and connected to your local network. It is now in DIY mode, and will stay this way when its power supply is disconnected and reconnected. The LED will signal “..”. You can still use the eWeLink app to add the switch to its device list, but this cancels the DIY mode. Also: once in DIY mode, you cannot access http://10.10.7.1, no matter how many times you force the device into making “ITEAD-{device_id}” available. The exception to this is if you power the switch up so it cannot connect to your local network. In which case, the device will eventually (20s) give up trying to connect and become eligible to be forced into Compatible Pairing Mode again.

Finding the Device on the Network

This is not as simple as it might be; it would have been neater if the http://10.10.7.1 web page reported its network name before disappearing. This is a bit messy.

Option 1 is to look at the client list in your router admin web interface. This is likely to require the “advanced” option. In my case, my modem-router does provide a client list, but rather confusingly the client name which is reported, and the MAC address have no obvious connection with the device_id revealed above. The client name is like “ESP_80DD76”, which is the standard way an ESP8266 (etc) chip reports its name, which takes the last 3 bytes of the MAC address prefixed by “ESP_”. If you can see something like this, you’ve found the IP address and can progress to the “Control the Switch” section, below, although you will have to trust that the default port is used (I see no reason why it would not be).

Option 2 is to use an mDNS browser. The essence of mDNS is a service which links device names to IP addresses and other info. MS Windows doesn’t seem to have an easy way to do this, but if you also run a Raspberry Pi it is not hard. Mac users can use Bonjour, there are Android Apps, and linux users (of course) can do as follows anyway. The service/tool for the Pi is Avahi. This should already be up and running, which can be confirmed using “service avahi-daemon status”. The mDNS browser requires the avahi-utils package, which was not on my Pi, but installed with “sudo apt-get install avahi-utils”. The simple command is not “avahi-browse –all –resolve”, which will reveal that the mDNS name is of the form “eWeLink_{device_id}.local”. The port entry will report which TCP port should be used to connect to the device, which should be the default of 8081. The txt entry reveals the device state if in DIY mode, or a base-64 string of gobbledy-gook if in eWeLink app mode.

Option 2a is an Android App. I tried one called Service Browser, but there are other mDNS browsers

Option 3 is to take advantage of the observation that the mDNS name is of the form “eWeLink_{device_id}.local”. This is fine so long as your method of controlling the switch can work with mDNS names, which Postman cannot (at present).

Caveat: there is no guarantee that your router will assign the same IP address. In my case, I operate a router which allows “address reservation”, so that several IoT devices I have get known IP addresses. Unless your router has this feature, beyond tinkering, you will have to resort to using mDNS-aware software. If writing your own in Python, it looks like the “zeroconf” package is suitable for dealing with mDNS names and not worrying about reliable IP addresses.

Control the Switch

I use Postman for manual testing, and Python for anything automated.

The documentation for interacting with the device is fairly good, and can be found at https://sonoff.tech/sonoff-diy-developer-documentation-basicr3-rfr3-mini-http-api/. This includes info on how to perform an OTA (“over the air”) firmware update, control how the device behaves when its power is restored (allowing you to choose whether it starts up off, on, or in its previous state), and to change the WiFI SSID and password which the device connects to, but I’m content with the info, on, off, and pulse options.

Note that all messages to the device are HTTP (not HTTPS) POST operations, with a JSON body (payload). Although the documentation includes “deviceid” in the JSON, this is completely redundant and can be omitted.

Summary:

Operation URL JSON Body
Info http://{{ip_addr}}:{{port}}/zeroconf/info
{
 "data": {}
}
On http://{{ip_addr}}:{{port}}/zeroconf/switch
{
  "data": { 
    "switch": "on" 
  }
}
Off http://{{ip_addr}}:{{port}}/zeroconf/switch
{
  "data": { 
    "switch": "off" 
  }
}
Pulse http://{{ip_addr}}:{{port}}/zeroconf/pulse
{ 
  "data": { 
    "pulse": "on", 
    "pulseWidth": 2000
  }
}

Pulse changes the behaviour of “On” operations; if Pulse is on then an “On” operation will only last for pulseWidth milliseconds, after which the device will turn off again. The maximum duration is 1 hour. The JSON payload for Pulse can omit the “pulseWidth” to enable/disable the feature, but it is not possible to change the pulse duration by only specifying “pulseWidth”.

Getting to Grips with the HLK-LD015 Human Presence (Radar) Sensor

The HLK-LD015 is a 5GHz radar-based device to detect the presence of human beings, and similar moving objects, produced by Hi-Link. The documentation is a little sparse, especially with regards to the UART configuration procedure/commands (not even the baud rate is given!), but the device is ridiculously inexpensive. I paid £1.40 + tax and shipping from an AliExpress seller and the Hi-Link website price is just $1.65. After some sleuthing, I discovered that the “dist_radar_setting_tool” under Downloads on the LD010 prduct page will work with the LD015, although it isn’t clear whether this covers all the options. To use this, you will need something like a USB-Serial converter (I used by old FT232 breakout). Note that the board requires the use of 5V supply and logic. Remember to connect the sensor TX to the converter RX, and vice versa.

I suspect these are made for automatic light control as there is a photodiode and the output is a single I/O pin, and there is no UART output, unlike some of the higher end sensors.

The photodiode is supposed to suppress sensing if there is over threshold light, but the documention is not clear about whether this is enabled, and mentions enablement in software while not explaining that (I am still uninformed on this topic).

The setting tool is branded as “Airtouch” and the chip on the LD015 bears the Airtouch logo and the part number “5810S 2038DA”. Referring to the Airtouch website, this seems likely to be the AT5810 chip. Unfortunately, there is no useful data there.

Observations Using the “Radar Setting Tool”

The setting tool is a fairly thin wrapper around sending UART style serial messages to the sensor. See below for my decoding of the serial message structure.

The baud rate is 9600.

The pertinent settings for practical device use are:

  • “Distance”, although this seems not to be a distance measure at all; “threshold” might be better. A value of 0 gives the most sensitive performance, with switching occurring up to around 6m distance. Increasing the value quickly decreases sensitivity and by 6, you have to be right close! Oddly, values of 255 seem fairly sensitive, while the drop-down list has values 0-15. There is no documentation on what the values mean, or the valid range, but a “write” followed by a “read” returns the same value.
  • “LightOnTime” controls the duration (seconds) the OUT pin remains high after detection.
  • “Lux THR” appears to have no effect and a write followed by a read DOES NOT return the written value (a 0 is always returned).
  • It is also possible to enable/disable the radar, and to manually turn the OUT pin on and off. These might be useful for a practical micro-controller + sensor set-up.

Settings appear not to be saved across power-down.

Testing sensitivity was not quite as easy as initially expected, with the triggering distances often seeming to change for the same distance setting. I suspect this is due to the algorithm used to avoid false detection, and it SEEMed to be the case that the sensitivity was a little better just after a bit of fairly close approach (within 2m), lasting a little while and then tailing off. The device will trigger both walking towards and away from the sensor, but it is strictly a movement sensor.

Sniffing the Serial Messages

I used a Picoscope with UART decoding to capture and decode the signals sent by the setting tool and returned by the device.

In the following, the messages are all shown as hexadecimal.

The messages are framed as: {2 bytes of preamble} + {1 byte giving length} + {message} + {1 byte checksum}.

The preamble for messages to the sensor is 555A, while that from the sensor is 55A5.

The length is of the message + the checksum.

The checksum is computed by summing all the bytes [preceeding it] and using the least significant byte (i.e. sum taken modulo 256).

The message is composed of 1 command byte and from 0 to two data bytes. Double-byte data is sent least-significant byte first.

The “light on” (OUT pin high) and off command is 0A, and a single byte of 00 or 01 turns the output off or on, respectively. e.g. 55 5A 03 0A 01 BD turns the output on.

The radar on/off command is D1, and the pattern is the same. e.g. 55 5A 03 D1 00 83 turns the radar off.

The command bytes for “distance”, “light on time”, and “lux thr” vary according to whether a read or write is being undertaken. Distance has one data byte, while the others have two bytes of data.

  • Read distance command is 03 and write distance command is 02
  • Read light on time command is 05 and write is 04
  • Read and write lux thr commands are 07 and 06, respectively

Example: read light on time requires a send of 55 5A 02 05 B6 (note length = 2 this time) and leads to a reply of 55 A5 04 05 2C 01 30 (note length = 4 bytes and that the data is 012C hex = 300 decimal).

General Conclusions

This would make for quite a usable sensor for use with a 5V system such as the “traditional” Arduinos. The range is rather limited and the simple on/off output rather limiting, but for a smaller room or located close to a doorway, this would work well.

Using Websockets with EmonPi MQTT Broker to Create a “Live Feed” Dashboard

This is not a complete description of the background tech; there is plenty of info on the web about websockets, mqtt, and the javascript libraries.

The motivating idea behind this experiment is to be able to have a live-updating dashboard with the minimum set of dependencies of the “install X” kind. Since the EmonPi aleady has a MQTT broker, it provides the basis for feeding data to the dashboard. Websockets allows a “web page” (it can be a file viewed in your web browser) to send and receive data, and update the page, using JavaScript. This is not particularly difficult, but there are several steps which took a couple of evenings to research and put into practice, so here are some notes (for myself in the future, and anyone else who finds it!).

If you are looking for some “homework reading”, a decent place to start is Steve’s Internet Guide.

Make Mosquitto Listen for Websockets Connections

Mosquitto is the MQTT broker on the EmonPi. It is configured to listen for connections which employ the “mqtt:” protocol. It is possible to add a websockets listener (“ws:” protocol), with the conventional port 9001 assignment as follows:

  • SSH onto the EmonPi and navigate to /etc/mosquitto.
  • Modify the mosquitto.conf file to add “listener 9001” and “protocol websockets”, see below. I have also added explicit lines for the default mqtt listener on port 1883 in the interest of clarity, although I believe they are not required. You will need to use “sudo”. Alternatively, it should be possible to add a file to the “include_dir”.
  • Restart mosquitto using “sudo systemctl restart mosquitto.service”
pid_file /var/run/mosquitto.pid
persistence false
persistence_location /var/lib/mosquitto/
log_dest file /var/log/mosquitto/mosquitto.log
include_dir /etc/mosquitto/conf.d

listener 1883
protocol mqtt

listener 9001
protocol websockets

allow_anonymous false
password_file /etc/mosquitto/passwd
log_type error

It should now be possible to test two things: that the existing mqtt protocol listener, which is relied on to service data to EmonCMS, and that the websockets listener us “up”. I used “MQTT Explorer”, a free and simple client, which should be set up to not validate a certificate and to have an empty “Basepath” (it defaults to “ws”).

Create the Websockets Dashboard with HTML and JavaScript

My primary aim for this experiment is to be able to co-opt EmonPi to broker air quality data from a home-brew particulate matter, VOC, NOx, CO2 sensor combo, but I’m using the existing emon data to demonstrate the concept, which boils down to “guages” using the Google Charts toolkit, and scrolling line charts using Chart.js.

Here is the code to hack about with, based on snippets from various places, with modifications and updating to a recent. It should just live in a plain text file with a “.html” extension, and can be opened in your web browser. It is not beautiful but demonstrates the concept. There is some logging to the “console”, which is where error messages will also appear. Hit F12 on Firefox or Edge (or Chrome too I think) to find the console.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <!-- This helps with viewing on mobile devices -->
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <title>Real-Time Charts</title>
    <!-- Google charts -->
    <script type="text/javascript" src="https://www.gstatic.com/charts/loader.js"></script>
    <!-- Chart.js -->
    <script src="https://cdnjs.cloudflare.com/ajax/libs/Chart.js/3.9.1/chart.min.js" integrity="sha512-ElRFoEQdI5Ht6kZvyzXhYG9NqjtkmlkfYk0wr6wHxU9JEHakS7UJZNeml5ALk+8IKlU6jDgMabC3vkumRokgJA==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
    <!-- The Paho Javascript for MQTT over Websockets -->
    <script src="https://cdnjs.cloudflare.com/ajax/libs/paho-mqtt/1.0.1/mqttws31.min.js" type="text/javascript"></script>
    <!-- Bootstrap CSS - can be removed but will help with the styling -->
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.2.3/css/bootstrap.min.css" integrity="sha512-SbiR/eusphKoMVVXysTKG/7VseWii+Y3FdHrt0EpKgpToZeemhqHeZeLWLhJutz/2ut2Vw1uQEj2MbRF+TVBUA==" crossorigin="anonymous" referrerpolicy="no-referrer" />
    <link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='main.css') }}">
</head>
<body>
<script type="text/javascript">
    // load the google charts stuff and only after it is loaded should we start to set up the data acquisition
    // otherwise, we find the charts library is being called upon before it exists!
    google.charts.load('current', {'packages':['gauge']});
    google.charts.setOnLoadCallback(doWhenReady);

    function doWhenReady() {
        // Create a client instance. NB clientid SHOULD BE DIFFERENT between browser clients; the following should work fine in a home environment.
        var clientId = "client-" + Date.now().toString();
        client = new Paho.MQTT.Client("192.168.1.1", 9001, clientId);
        // set callback handlers
        client.onConnectionLost = onConnectionLost;
        client.onMessageArrived = onMessageArrived;
        // connect the client
        client.connect({onSuccess:onConnect, userName: "emonpi", password: "emonpimqtt2016"});

        // called when the client connects
        function onConnect() {
          // Once a connection has been made, make subscriptions
          console.log("onConnect");
          client.subscribe("emon/emonpi/vrms");
          client.subscribe("emon/emonpi/power1");
        }

        // called when the client loses its connection
        function onConnectionLost(responseObject) {
          if (responseObject.errorCode !== 0) {
            console.log("onConnectionLost: " + responseObject.errorMessage);
            document.getElementById("status").innerText = "Lost connection due to: " + responseObject.errorMessage;
          }
        }

        // called when a message arrives. Note that the full topic name is aka "destinationName"
        function onMessageArrived(message) {
          document.getElementById("status").innerHTML = "";
          console.log("onMessageArrived: " + message.destinationName + " : " + message.payloadString);
          if (message.destinationName == "emon/emonpi/vrms") {
            // "+" converts string to numeric, which is then rounded by toFixed(), which returns a string!
            var vrms = (+message.payloadString).toFixed(2);
            // Google chart guage
            vrmsGaugeData.setValue(0, 1, +vrms);
            vrms_gauge.draw(vrmsGaugeData, vrmsOptions);
            // scrolling line chart.js
            if (vrmsChartData.data.labels.length === 20) {
                vrmsChartData.data.labels.shift();
                vrmsChartData.data.datasets[0].data.shift();
            }
            // we don't have the time in the mqtt message!
            var timestamp = new Date().toLocaleTimeString();
            vrmsChartData.data.labels.push(timestamp);
            vrmsChartData.data.datasets[0].data.push(vrms);
            lineChartVrms.update();
          } else if (message.destinationName == "emon/emonpi/power1") {
            var power1 = message.payloadString;
            // Google
            power1GaugeData.setValue(0, 1, +power1);
            power1_gauge.draw(power1GaugeData, power1Options);
          }
        }

        // ------------- Google Charts -----------
        var vrmsGaugeData = new google.visualization.arrayToDataTable([
            ['Label', 'Value'],
            ['Vrms', 240]
        ]);
        var power1GaugeData = google.visualization.arrayToDataTable([
            ['Label', 'Value'],
            ['Power', 0]
        ]);
        var vrmsOptions = {
            min        : 200,
            max        : 260,
            minorTicks : 5,
            greenFrom  : 230,
            greenTo    : 250,
            majorTicks : ['200', '210', '220', '230', '240', '250', '260']
        };
        var power1Options = {
            min        : 0,
            max        : 5000,
            minorTicks : 5,
            majorTicks : ['0', '1kW', '2kW', '3kW', '4kW', '5kW']
        };

        var vrms_gauge = new google.visualization.Gauge(document.getElementById('vrms-gauge'));
        var power1_gauge = new google.visualization.Gauge(document.getElementById('power1-gauge'));
        vrms_gauge.draw(vrmsGaugeData, vrmsOptions);
        power1_gauge.draw(power1GaugeData, power1Options);

        // chart.js
        var chartOptions = {
            responsive: true,
            tooltips: {
                mode: 'index',
                intersect: false,
            },
            hover: {
                mode: 'nearest',
                intersect: true
            },
            scales: {
                xAxes: {
                    display: true,
                    scaleLabel: {
                        display: true,
                        labelString: 'Time'
                    }
                },
                yAxes: {
                    display: true,
                    scaleLabel: {
                        display: true,
                        labelString: 'Value'
                    }
                }
            }
        };

        // this allows us to have common chart options, while varying the scale in each chart
        var vrmsChartOptions = structuredClone(chartOptions);
        // use sensible range, which will be expanded if the data goes outside "suggested"
        vrmsChartOptions.scales.yAxes.suggestedMin = 230;
        vrmsChartOptions.scales.yAxes.suggestedMax = 250;

        var vrmsChartData = {
        type : 'line',
        data : {
            labels: [],
            datasets : [{
            label : 'Vrms',
            backgroundColor : 'rgba(255, 136, 0, 0.5)',
            borderColor : 'rgba(255, 136, 0, 1.0)',
            fill: true,
            data : []
            }]
        },
        options : vrmsChartOptions
        };

        const lineChartVrms = new Chart('vrmsChart', vrmsChartData);
    }
</script>

<h1>EmonPi Live Feed</h1>
<div class="container">
    <div class="row">
        <div id="vrms-gauge" style="width: 120px; height: 120px;"></div>
        <div id="power1-gauge" style="width: 120px; height: 120px;"></div>
    </div>
    <div class="row" id="status">loading data...</div>
    <div class="col-10">
    <div class="card">
        <div class="card-body">
            <canvas id="vrmsChart"></canvas>
        </div>
    </div>
</div>
</div>
</body>
</html>

Don’t forget to change the IP address for the MQTT broker!

Serving the Dashboard

While the above HTML + JavaScript works fine as a file on your PC (so long as you have a network connection to acquire all those JavaScript libraries from), it can also be placed on the EmonPi.

I have opted to create a separate area from EmonCMS in the interest of avoiding too much risk of cock-up, future muddle, etc… The files in this separate area will all be “static” in the sense that they are just served to users as they are (no PHP etc). The dashboard page above IS static in this sense, since the JavaScript executes on your PC inside the web browser; all the EmonPi does is send it.

EmonCMS uses the Apache2 web server, so it is easy to make it listen on a different port (EmonCMS uses port 80, and I have chosen port 8001) using the following commands (in a SSH).

Create the place where the HTML file will live

cd /var/www
mkdir static

Check the directory ownership is correct using “ls -la”, which should show “drwxr-xr-x 2 pi pi 4096 Nov 27 23:02 static”.

If necessary “chown pi:pi static”.

Place the HTML + JavaScript file here (I suggest using the scp command, for example “scp dashboard.html pi@192.168.1.1:/var/www/static”).

Change the Apache2 config

There are two things to change, first to make apache2 listen on port 8001, and secondly to tell it to use the files in /var/www/static in connection with port 8001 requests.

cd /etc/apache2
sudo nano ports.conf

Add a single line “Listen 8001”, then save and exit.

For the second step, enter the “sites-enabled” directory. I chose to copy the existing emoncms.conf file, naming the copy “static.conf” and editing it to contain:

<VirtualHost *:8001>
  ServerName localhost
  ServerAdmin webmaster@localhost
  DocumentRoot /var/www/static

  # Virtual Host specific error log
  ErrorLog /var/log/static/apache2-error.log

  <Directory /var/www/static>
    Options FollowSymLinks Indexes
    AllowOverride All
    DirectoryIndex index.html
    Order allow,deny
    Allow from all
  </Directory>
</VirtualHost>

The changes are fairly obvious, but note that I have also added “Indexes” against “Options”. This makes apache give a listing of all the files in “static” when I use the URL “http://192.168.1.1:8001”. If you only have one dashboard, then simply name it “index.html” and it will appear when that URL is used.

Make a directory for logs

Otherwise apache2 will not restart.

cd /var/log
mkdir static

Restart Apache2

To load the new config.

sudo systemctl restart apache2

Make sure EmonCMS is still working, then visit your new site! It should work adequately on a mobile phone, once rotated to landscape format.

 

Using MQTT in Open Energy Monitor to Capture External Device Data

I struggled to find information about this using web searches, so here is a condensed “how to”. The scenario I have is using a home-brew ESP8266 based device attached to my solar PV inverter which I want to relay definitive power output to my Open Energy Monitor via MQTT. This seems quite simple in principle, and is simple in practice, but seemingly not well documented.

First thing is to publish the data to an MQTT broker on the emonpi with a topic which starts “emon/{source}/{key}”, replacing {source} with a suitable name for the data source and {key} with the attribute name for the data being sent. In my case, I used “emon/solis/power” as the data source is the output power for my Solis PV inverter. The message payload is simply the data value. This immediately makes the published data appear on the “Inputs” screen of EmonCMS.

Two refinements/possibilities:

Send JSON

Rather than just a single value, send several in the same message. There are two ways to do this:

a) use a topic of form “emon/{source}”

b) use a topic of form “emon/{source}/{key}”

If the message payload sent to the MQTT topic “emon/solis/power” looks like {“ac”: 90, “dc”: 120}, option (a) creates EmonCMS inputs “ac” and “dc” under “solis”, while option (b) creates inputs called “power_ac” and “power_dc”.

Include a timestamp

To do this, simply include an extra field in the JSON called “time”, with a value which is the Unix time. If you are testing, the Unix time needs to be fairly close to the actual time (ignoring summer time) otherwise EmonCMS will indicate “inactive”, but it still captures the data.


Aside: I used the VSMqtt plugin for VSCode as the MQTT client, as I’m using the PlatformIO plugin to develop my ESP8266 code (using the Arduino libraries). Update: these days I’ve changed to using MQTT Explorer as my go-to software for viewing MQTT messages.