This blog post documents a Clinical Intelligence Engine architecture template for integrating data generated by Augmedix into a healthcare enterprise’s data lake built on Google Cloud services. Clinical Intelligence Engine is a Google Cloud-based data analytics solution that proactively generates clinical intelligence and care recommendations for population care. It enables health systems to act on these insights to improve their cost and quality performance, which makes it an essential tool for health systems pursuing clinical transformation.
Augmedix (Nasdaq: AUGX) delivers industry-leading, ambient medical documentation and data solutions to healthcare systems, physician practices, hospitals, and telemedicine practitioners.
Augmedix is on a mission to help clinicians and patients form a human connection by seamlessly integrating its technology at the point of care. Augmedix’s proprietary platform digitizes natural clinician-patient conversations and converts them into comprehensive medical notes and structured data in real time. The platform uses the latest LLM technology from Google (MedLM, Gemini) and Speech-to-Text to generate accurate and timely medical notes that are added to the electronic health record (EHR).
This document is intended for enterprise architects who are experienced with data lake architecture and familiar with the relevant Google Cloud services. The architecture applies both to organizations that already have a data lake and want to integrate it with additional data sources, and to organizations deploying a data lake for the first time. The proposed architecture is flexible and can integrate with more data sources.
Details and explanation about the components described in Reference Architecture (Figure 1):
| Name | Type | Description | Location in Pub/Sub message |
| --- | --- | --- | --- |
| _noteID | STRING | Provided by Augmedix and used for querying additional data about a note | /id |
| _timeEncounterStart | TIMESTAMP | Recording start time | /startTime |
| _timeEncounterEnd | TIMESTAMP | Recording end time | /endTime |
| _encounterID | STRING | ID provided by the Electronic Medical Record (EMR) | /encounterId |
| _transcriptSummary | JSON | Notes generated by Augmedix | /transcriptSummary |
| _structuredData | JSON | Structured data with History of Present Illness (HPI) and complaints | /noteContent/ComplaintsSelections |
| _noteFull | STRING | Complete note with all subsections | /noteSummary/fullNote |
| _noteHPI | STRING | History of Present Illness (HPI) section of the note | /noteSummary/HPI |
| _noteAP | STRING | Assessment and Plan (AP) section of the note | /noteSummary/AP |
| _noteROS | STRING | Review of Systems (ROS) section of the note | /noteSummary/ROS |
| _notePE | STRING | Physical Exam (PE) section of the note | /noteSummary/PE |
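To make the field mapping above concrete, here is a minimal Python sketch of how a subscriber might turn one decoded Pub/Sub message into a flat row. It assumes the message body is a JSON object and that the locations in the table are slash-separated paths into it. The helper names (`resolve`, `message_to_row`) and the choice to serialize JSON-typed columns as text are illustrative assumptions, not part of the Augmedix export API.

```python
import json
from typing import Any

# Column name -> location in the Pub/Sub message, copied from the field table above.
FIELD_PATHS = {
    "_noteID": "/id",
    "_timeEncounterStart": "/startTime",
    "_timeEncounterEnd": "/endTime",
    "_encounterID": "/encounterId",
    "_transcriptSummary": "/transcriptSummary",
    "_structuredData": "/noteContent/ComplaintsSelections",
    "_noteFull": "/noteSummary/fullNote",
    "_noteHPI": "/noteSummary/HPI",
    "_noteAP": "/noteSummary/AP",
    "_noteROS": "/noteSummary/ROS",
    "_notePE": "/noteSummary/PE",
}

# Columns typed JSON in the table. Serializing them to JSON text is an
# assumption about what the downstream loader expects.
JSON_COLUMNS = {"_transcriptSummary", "_structuredData"}


def resolve(message: Any, path: str) -> Any:
    """Walk a slash-separated path through nested dicts; None if a key is missing."""
    node = message
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def message_to_row(payload: bytes) -> dict:
    """Build one row (hypothetical helper) from the raw bytes of a message body."""
    message = json.loads(payload)
    row = {}
    for column, path in FIELD_PATHS.items():
        value = resolve(message, path)
        if column in JSON_COLUMNS and value is not None:
            value = json.dumps(value)
        row[column] = value
    return row
```

Missing fields come back as `None` rather than raising, so a partially populated message still yields a complete row; whether that is the right behavior depends on how strictly the destination table should be validated.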
Data flow described in Reference Architecture (Figure 1):
The traditional source of truth in healthcare service delivery has been the electronic medical records system (EMR), but ambient documentation taps into a rich layer of data: the conversation between the clinician (provider) and the patient, where much of healthcare delivery actually takes place. This conversational level, at the point of care, holds a valuable data set that often never finds its way into the EHR; when it does, it is often reduced to unstructured text during EHR writeback. Augmedix’s structured data output to a data lake (outside of the EHR) preserves structured data from the point of care and enables several high-value use cases such as:
The data payload that could be sent from Augmedix’s ambient documentation solution includes:
All data elements can be delivered either in full or de-identified, that is, with Protected Health Information (PHI) reduced.
For a sample dataset, please refer to: Note Sample, Transcript Sample
This reference architecture recommends Pub/Sub as the communication infrastructure between Augmedix and the enterprise’s HDE, as described in the data flow section above. This approach is considered secure and scalable, and it keeps Augmedix’s solution and the enterprise’s HDE highly decoupled. Alternatively, direct API calls could be used: Augmedix would call a set of API endpoints provided by the enterprise to send the data payload or report encounter status. Augmedix has the infrastructure to adapt quickly to customer-provided APIs and has done so before.
If you prefer this approach, please contact the Augmedix Partnerships team at partnerships@augmedix.com to get in touch with an integrations architect.
This reference architecture describes a one-way data flow, from Augmedix’s ambient documentation solution to the healthcare data lake. In addition to the encounter recording, Augmedix uses data from the EMR to improve the quality and comprehensiveness of the generated clinical notes. Advanced and highly utilized data lakes could hold data points from multiple systems, or generate valuable insights that could further enhance the clinical notes. Injecting these data elements and insights into Augmedix’s clinical note generation process could yield significant value to the enterprise through higher-quality notes, fewer missed charges, and greater efficiency gains for clinicians.
To discuss this option, please contact the Augmedix Partnerships team at partnerships@augmedix.com to get in touch with an integrations architect.
Data lakes typically aim to aggregate as much data as possible to support future use cases. But there could be cases where the enterprise prefers to filter the information it processes, for example for cost control, load control on critical downstream resources, or privacy considerations. Augmedix can send only a subset of the encounters by applying filters on encounter type, specific providers, specific facilities, and more.
To discuss this option, please contact the Augmedix Partnerships team at partnerships@augmedix.com to get in touch with an integrations architect.
Note: Augmedix holds the encounter data for one week before it is permanently purged. If the data is not exported to the data lake within this timeframe, it will not be available.
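As a rough illustration of how an export pipeline might track this window, the sketch below computes the purge deadline for an encounter and flags encounters that are close to it. It assumes the one-week clock starts when the encounter data was sent to the EMR. The function names and the one-day safety margin are hypothetical and should be tuned to your own pipeline.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(weeks=1)          # one-week retention stated in the text
SAFETY_MARGIN = timedelta(days=1)       # hypothetical buffer before the purge


def purge_deadline(sent_to_emr_at: datetime) -> datetime:
    """Time after which the encounter data may no longer be available."""
    return sent_to_emr_at + RETENTION


def is_at_risk(sent_to_emr_at: datetime, now: datetime,
               margin: timedelta = SAFETY_MARGIN) -> bool:
    """True if an encounter not yet exported is within `margin` of being purged."""
    return now >= purge_deadline(sent_to_emr_at) - margin


# Example: an encounter sent to the EMR on 1 March is purged on 8 March.
sent = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
deadline = purge_deadline(sent)
```

A check like this could run on any encounter that the data lake expects but has not yet ingested, so that a stalled export is noticed while the source data still exists.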
Augmedix guarantees a 99.0% uptime SLA for its data export API and retains the data (raw transcripts, notes, structured data) for one week after the encounter data was sent to the EMR. This SLA should suffice for most data lake use cases under normal circumstances. Some events or unique use cases may require a higher uptime SLA or longer data retention in Augmedix’s database. For example:
In any of these cases, please promptly contact the Augmedix Partnerships team at partnerships@augmedix.com.
When designing your data lake, carefully consider if PHI is needed for your use cases. The general rule of thumb is ‘if you don’t absolutely need it, don’t deal with it.’ Augmedix’s ambient clinical documentation solution can export the data with PHI or de-identified.
If your data lake use cases do not require holding PHI, it is recommended that you use the de-identified data export from Augmedix’s ambient clinical documentation solution. This keeps your database and the entire pipeline free of PHI and permits broader access to the information. You may still need to link data from the ambient clinical documentation solution to the EMR record. The link (a token or a key) may itself be considered PHI and would need to be protected. You should consult with your compliance officer on the best approach to manage this link.
If your data lake use cases require PHI, you should make sure the design of the data lake complies with your organization’s HIPAA policy.
Please also note that the data exported from Augmedix’s ambient clinical documentation solution is not the final version of the clinician note. The clinician may edit the data in the EMR or in other systems after it was generated by Augmedix’s ambient clinical documentation solution. The source of truth and the final record for all clinical information is the EMR record.
Data lake use cases typically allow some latency in data processing. Pub/Sub serves as a buffer that absorbs surges of load coming from Augmedix’s ambient clinical documentation solution. It is important to continuously monitor the number of unprocessed messages in the queue and the total time it takes to fully ingest a data payload into the data lake. If data freshness falls short of the SLA you agreed with your stakeholders, you should consider scaling the required services.
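The monitoring rule above can be sketched as a small decision function. This is a minimal illustration, not a prescribed implementation: the `PipelineSample` fields, the thresholds, and the reason labels are all assumptions. In practice the inputs could come from Cloud Monitoring, where Pub/Sub exposes subscription metrics such as `num_undelivered_messages` and `oldest_unacked_message_age`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PipelineSample:
    unprocessed_messages: int     # backlog size in the subscription
    oldest_message_age_s: float   # age of the oldest unprocessed message, seconds
    ingest_duration_s: float      # time to fully ingest one payload, seconds


def scaling_reasons(sample: PipelineSample, max_backlog: int,
                    freshness_sla_s: float) -> List[str]:
    """Return the reasons (possibly none) to consider scaling the pipeline."""
    reasons = []
    if sample.unprocessed_messages > max_backlog:
        reasons.append("backlog")
    # Worst-case staleness: the oldest waiting message still has to be ingested.
    if sample.oldest_message_age_s + sample.ingest_duration_s > freshness_sla_s:
        reasons.append("freshness")
    return reasons
```

For example, with a one-hour freshness SLA and a backlog limit of 100 messages, a sample of 500 unprocessed messages whose oldest entry is 3,500 seconds old and takes 200 seconds to ingest would trigger both reasons.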
Augmedix guarantees sending the data to the data lake within 30 minutes of the time it was sent to the EMR. Please note that a few hours may pass between encounter completion and sending it to the EMR.
You can deploy the solution in the customer's GCP organization. All the Google Cloud products described in the architecture are generally available for customers to start immediately. Augmedix, out of the box, supports publishing messages to Pub/Sub.
You may choose one of the following two options for implementation support and services:
Here is a list that could provide more details on the Augmedix and Google products used in this reference architecture:
Whenever you use clinical information, you should familiarize yourself with data protection regulations. Consult with your compliance officer; the following link is a good starting point:
https://www.hhs.gov/guidance/document/summary-hipaa-security-rule-0
Automated ambient clinical documentation that is based on AI, and specifically LLMs, is a very new and dynamic domain. Augmedix fully recognizes that this reference architecture should evolve over the next few months. Augmedix would appreciate your feedback and a discussion about your current and future needs. Your feedback is valuable to Augmedix and will shape its product roadmap to more closely match your use cases.