Using Z-Stack ZNP to Make DIY Zigbee Devices to Work With Zigbee2MQTT

Z-Stack ZNP (Zigbee Network Processor) allows you to make a Zigbee device by running ZNP on the Zigbee chip (CC2530 etc) and a microcontroller to read sensors, actuate motors, display stuff etc, using a basic communication protocol (UART serial or SPI) between the two. This saves you from having to compile code for the Zigbee chip (which requires a costly IDE for the CC2530) and means you can, for example, work with user-friendly libraries such as Arduino (whether for Arduino hardware, ESP8266, etc…).

Unfortunately, the documentation for ZNP is not great. Information is spread over several documents. There are errors. There is a distinct lack of information which “walks you through”. This article is my notes on how I got things to work, with a few caveats: I am working with a Zigbee2MQTT environment running @KoenKK’s Z-Stack HA 1.2 coordinator firmware and am explicitly working only within the scope of the Zigbee 1.2 Home Automation Profile on a CC2530 device. I see no reason why what follows could not be applied to Zigbee 3 or to chips other than CC2530 with a few tweaks.

Getting Started

Download Z-Stack Home 1.2 from the Texas Instruments Z-Stack archive (it should be the one named “Z-STACK-HOME — ZigBee Home Automation Solutions“, with an installer file name of “Z-Stack_Home_1.2.2a.exe”. This provides you with the Z-Tool and some documentation. Do not use the firmware included! There is source code, which can sometimes be useful to work out details which the documentation does not explain.

Download the @KoenKK Home 1.2 coordinator firmware for your hardware (CC2530 etc). This is almost the same as the ZNP firmware available in the TI Z-Stack download but has a few modifications and different compiler flags set. These differences are essential! Although this is designated as “coordinator”, it is possible to change the role to end-device using ZNP. This is what we’ll do.

I’m using my E18 boards (see my E18 breakout post), but any basic development board based around the same CC2530 chip should work just the same, so long as it exposes the UART pins (see below). Ebyte sell a dev board with an E18 already mounted and an onboard USB-serial converter (but you will need to use jumper wires to connect the the CC2530 UART I/O to this). I will use “ZNP board” to refer to this hardware.

Flash the ZNP board with the @KoenKK firmware.

I’ll be using Zigbee2MQTT (hereafter “Z2M” and the ZBOSS/Wireshark sniffer as I outline my DIY Zigbee post. The sniffer is very useful for observing the sequence and content of messages and will helpfully show when you try to send malformed messages (these are often not reported by ZNP, which cheerfully responds with “success” messages).

For the purposes of exploration and learning, I use a PC to communicate with the ZNP board, rather than a micro-controller. This makes it much easier to experiment, while knowing that the same UART messages send from a MCU would have the same effect.

The system outline is: [PC – USB/Serial adapter] << UART serial >> [ZNP board] << Zigbee wireless >> [Zigbee coordinator dongle] << serial tty >> [Zigbee2MQTT].

At the PC end, the ZNP commands can be generated and sent “as you please”. I used Z-Tool (see below), some Python code, and (mostly just to show it worked) Termite with the “hexadecimal view” filter, my favourite serial terminal. ZNP commands are just a bunch of bytes!

While the TI ZNP firmware is compiled to require hardware flow control (which Arduino libraries do not have), and to require configuration (SPI vs UART) via a hardware pin, the @KoenKK does not (see the “patch” file in the github repo if you are interested to see the changes). It only supports UART comms and uses P0.2 for RX and P0.3 for TX. Remember to cross-over TX and RX between the USB-serial adapter and the ZNP board. It also exposes four GPIOs via ZNP on P0.6, P0.7, P1.6, and P1.7.

Unfortunately, the UART comms option doesn’t come with any power saving support, whereas the SPI option does (although how effective it is I do not know). The bottom line is that the ZNP board seems to draw a continuous 23mA approx, which would be hopeless for a battery device. For a sensor/output-oriented device it would be feasible to use a MOSFET to kill the power to the E18 before putting the MCU to sleep, but devices which should be remote-controlled or queried will just have to be mains powered. I don’t suppose this is such a big deal as such devices are likely to have more energy consumption than a battery is likely to be suitable for.

Note: I will use “Termite” to mean “use the serial terminal of your choice” and “Wireshark” to mean “use the Zigbee sniffer of your choice”.

Documents

The main references which I found to be useful, specifically those relevant to HA 1.2, are:

[R1] “Z-Stack ZNP Interface Specification” – outline of ZNP, esp the physical interface (for various SoCs and devkits) and message frame structure.

[R2] “Z-Stack Monitor and Test API” – basic message and response structure for commands shown in Z-tool (except the Simple API). This is the core reference for ZNP message structures.

[R3] “Z-Stack Simple API” – general description of Simple API + index of configuration ids.

[R4] “Developing a ZigBee System Using CC2530 ZNP” – this contains a very useful walkthough of ZNP commands but doesn’t explain much. (Lamentably, TI seem to have removed this from their online document library.)

[R5] “ZigBee Home Automation Application Profile” – contains prescribed profile Id, device Ids, and cluster Ids applicable to HA 1.2. There is also a more recent Zigbee Cluster Library specification [R6] which is applicable to Zigbee 3.0 (and will surely have incompatible elements alongside stuff which works!). HOWEVER: this HA profile does not document the ZCL frame structure or the Basic Cluster attributes; there is presumably a Z1.2 document for these… but [R6] seems good enough. (Lamentably, the Zigbee Alliance, now CSA, no longer make this available. Bad people!)

[R6] “07-5123-08-Zigbee-Cluster-Library” – revision 8 of the Zigbee Cluster Library, aka “ZCL”.

[R1] to [R3] may be found in the Documents folder of the Z-Stack installation.

First Test – Termite and Z-Tool

Having connected P0.2 and P0.3 as noted above, connect with Termite at 115200 baud (at this point it is best not to have the sniffer dongle also attached, so you can easily see which the correct serial port is). Hit the ZNP board reset button and watch for its reset indication message to appear in Termite. This message is called “SYS_RESET_RESPONSE” (but some of the documentation calls it “SYS_RESET_IND”). Note that the message is framed: there is a “start of frame” (SOF) byte 0xfe, followed by the message body, and terminated by a single “FCS” checksum byte. The checksum uses the XOR8 algorithm (look it up). The message body starts with 1 length byte, followed by a 2 byte command and then a variable length of data. Review the message alongside the documentation for the message frame structure and the SYS_RESET_RESPONSE message type. Get some paper and check the checksum! Note that the command id (0x4180) is useful for looking up in the documentation when the “friendly” name has been changed.

The same message structure is the basis for all ZNP messages.

Close Termite’s connection to the serial port and start up Z-tool (it can be found in the Tools folder of the Z-Stack installation). Unfortunately, Z-tool defaults to the wrong baud rate. Look in Tools > Settings and change it to 115200 for the serial port in question. The setting persists for each serial port.

It should scan for and find your device (use Tools > Scan for devices to repeat the scan). It can take a while for the ZNP board to become visible on a cold start, but it should come up quickly once you’ve seen it appear in Termite. Hit the ZNP board reset button again and correlate the Z-Tool presentation with the raw serial message in Termite.

Now experiment with a few simple commands such as SYS_VERSION, SYS_PING, and SYS_RESET. UTIL_GET_DEVICE_INFO is also quite informative. You should be able to see the hardware MAC address (aka IEEEAddr) and the ShortAddress, which is used within the PAN. DeviceType is a little confusing; it shows the roles the firmware is capable of operating as, rather than the role which it currently has. The DeviceState for an unjoined device should be 0 = “DEV_HOLD”. More on device states later.

If feeling adventurous, wire some LEDs to the GPIO pins mentioned above and experiment with UTIL_LED_CONTROL and SYS_GPIO (you will definitely need to read the documentation carefully). Note that some SYS_GPIO settings interfere with UTIL_LED_CONTROL.

Also: try composing raw byte sequences using Termite, remembering the SOF and FCS bytes. You will find that ZNP will not respond if you make a mistake; in general, do not expect to receive an error message but DO expect to receive an explicit response message. The response might be a simple “success” response, some data, or a deferred indicator/callback as for the case of a restart.

None of these ZNP messages have gone beyond the ZNP board; they are only communicating with the ZNP firmware.

Preparing the ZNP Board as an End Device

So far, the ZNP board still thinks it is a coordinator; some configuration settings must be changed.

First, though, we should give some thought to the start-up procedure. Whether we’re thinking of a PC or a MCU communicating with the ZNP board, we need to know when it is ready. Refer to [R1]. In practice this amounts to waiting for SYS_RESET_RESPONSE (command id 0x4180) before attempting to send ZNP messages.

Although this configutation is only needed once, I think it makes sense to execute these commands with a script each time I change anything. For an MCU-based device in use, I would not normally perform these steps unless a jumper or button is active at start/reset time.

The command to use is ZB_WRITE_CONFIGURATION (Simple API) and each command should cause a ZB_WRITE_CONFIGURATION_RSP response from ZNP. Refer to [R2] for the structure of the ZB_WRITE_CONFIGURATION message and response (in the doc, “SREQ” is the request and “SRSP” is the response). The values for ConfigId are hidden in section 5.3 of [R3]. Note that Z-Tool expects the configIds to be provided in decimal form!

The following should be set:

ZCD_NV_LOGICAL_TYPE (configid=0x87=135d), value=0x02 for end device.
ZCD_NV_PAN_ID (configid=0x83=131d), a value of 0xffff means “don’t care”, which is appropriate for a single PAN environment as we want the device to join any PAN it finds.
ZCD_NV_CHANLIST (configid=0x84=132d), value = 4 bytes bitmask of channels to use. The LSB is channel 0, so the bitmask for just channel 11 would be 0x00 00 08 00. This is the compiled default and leaving it as-is does not stop the end device connecting to a coordinator on a different channel. Setting ZCD_NV_CHANLIST to match the coordinator channel will reduce scan-and-connect-to-PAN time because the end device will try channels in ZCD_NV_CHANLIST first, and then try others. I use channel 16 = binary 1 0000 0000 0000 0000 = 0x00 01 00 00. Note that in both Z-Tool and in the raw ZNP message, the value is set in little-endian byte order, so the bytes to send (in order) for channel 16 would be 00, 00, 01, 00 (remembering to use decimal form of each byte in Z-Tool).

I also set the ZCD_NV_POLL_RATE. This modifies the interval with which the device contacts the coordinator with IEEE 802.15.4 “Data Request” packets once joined (see Wireshark sniffer). The compiled default is 1000ms. THe documentation wrongly states this is config_id 0x24 and has 1 byte length. It is actually config_id = 0x35 and is a 4 byte value (in ms) expressed in little-endian form! I use a value of 15s, which matches the PTVO firmware. Expressing in ms and in little-endian order, this requires the series of bytes: 0x98 0x3A 0x00 0x00.

A likely cause of problems when “messing about” is caused by the way ZNP stores network state in non-volatile memory. While this is a good thing for devices in real use, speeding up their re-connection, it is helpful to always have a clean start if messing about… So, for experimental work, set the ZCD_NV_STARTUP_OPTION (configid=3) to a value of 2. This will clear the network state on restart. Issue a ZB_SYSTEM_RESET command to clear state immediately.

At this point the ZNP board is still unconnected from the network. It is not joined to a PAN and you will not see any network traffic to/from it in Wireshark (or other sniffer).

Joining a PAN

This involves two steps: first declaring to ZNP what “clusters” it offers/consumes, and second is telling ZNP to attempt to join a PAN.

There are some concepts and associated names for them which need to be understood before the first step can be completed.

Endpoint is a way of identifying separate features offered by a single physical device (identified by its MAC address and two byte PAN-network address). Consider an endpoint as a “virtual” device which shares a single physical device. For example, a Zigbee device controlling three separate lights would have three endpoints.

ClusterId is a way of identifying the set of attributes which a device feature supports. A simple on/off device has a cluster id of 0x0006. There is a Basic Cluster, id = 0x0000, which all devices should support. The Z2M “interview” involves the coordinator sending requests to read attributes in the Basic Cluster such as the device model and manufacturer. See [R5] and [R6].

InCluster, OutCluster, Server, Client are very confusing terms used in the Zigbee specifications. They often seem to be the wrong way! One way of deciding what they should be is to create a device using PTVO Configurable Firmware builder and then observe the messages using Wireshark. The InCluster list contains “server” features and the OutCluster list contains “client” features. Some examples: a light is a “server”, so is set up as an input cluster; a switch is a “client”, and a temperature sensor a “server”. I see that PTVO devices put the Basic Cluster as both an input and output cluster, but thinks work just fine with it defined as only an input cluster.

The first step, declaring the clusters uses the AF_REGISTER command (use Z-Tool), with the following values:

Endpoint = 1 (actually a free choice, but 1 seems logical!)
AppProfId must be set to the Home Automation Profile ID, which is 0x0104.
AppDeviceId should be chosen from the values given in [R5]. I used 0x0000.
AppDevVer can be freely chosen.
Set a length of 1 for the InClusters list and add the Basic Cluster Id.

After defining the endpoint, set Z2M to accept join requests and start Wireshark (remember to set the channel to match the Z2M coordinator).

Now for the second step. This will trigger network activity and a series of ZNP callback messages to inform you of the progress of joining. The command to use is ZDO_STARTUP_FROM_APP, setting a delay of 0 (appears as “Mode” in Z-Tool). The ZNP callbacks should comprise one ZDO_STARTUP_FROM_APP_RSP (with a network status code of 1) and several ZDO_STATE_CHANGE_IND messages. The status codes given by ZDO_STATE_CHANGE_IND are not in the documentation but can be found by referring to Components\stack\zdo\ZDApp.h in your Z-Stack 1.2 installation (find the devStates_t enum). The values I see follow the pattern: 2 DEV_NWK_DISC {potentially repeated} -> 3 DEV_NWK_JOINING -> 5 DEV_END_DEVICE_UNAUTH -> 6 DEV_END_DEVICE. Z-Tool incorrectly decodes status = 2 as “INVALID_PARAMETER”. If you see status 2 arriving multiple times it means you did not set the channel list to match the coordinator, and if you only ever see status 2, the coordinator is probably off-line or not accepting join requests. Status 6 means the device has joined the PAN as an end device and it should appear in the Z2M web interface.

You will now see a bunch of AF_INCOMING_MSG messages. These are from the coordinator to the ZNP device and are caused by Z2M’s “interview”, which is trying to read the attributes in the Basic Cluster. It will eventually give up. The ZNP board is now joined into the PAN but is not properly described in Z2M and can do nothing useful. Still: getting this far is an achievement.

Aside: there are also Simple API commands for this process, but I find the responses less informative.

Message Log

Here are the raw ZNP messages sent over UART and the responses (generated by a Python script – see Part 2). The responses have been parsed out to command + body, whereas the TX bytes include the SOF and FCS.

TX: fe 03 26 05 87 01 02 a4
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 04 26 05 83 02 ff ff a6
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 06 26 05 84 04 00 00 01 00 a4
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 03 26 05 03 01 02 20
[Response] RX body: Cmd: 6605 Body: 00
TX: fe 06 26 05 35 04 98 3a 00 00 b6
[Response] RX body: Cmd: 6605 Body: 00

And here is the AF_REGISTER and “start”.

TX: fe 0b 24 00 01 04 01 00 00 01 00 01 00 00 00 2b
[Response] RX body: Cmd: 6400 Body: 00
TX: fe 01 25 40 00 64
[Response] RX body: Cmd: 6540 Body: 01

In Wireshark, you should see an Association Request, followed (but probably not immediately) by an Association Response. Note that the protocol is IEEE 802.15.4, rather than “Zigbee”, as these are network-level interactions. See that the request and response source and destination use a MAC address and that the short address (2 bytes) which will subsequently be used (and is shown in Z2M) is provided in the response.

Further down the Wireshark log, you should see a “Transport Key” message being sent to the short address, now given as “Zigbee” protocol, followed by a “Device Announcement” where the newly-joined device broadcasts its presence.

To be Continued…

Part 2 will look at the request and response messages involved in the Z2M “interview”, and beyond to build a simple device to demonstrate reporting and remote control.