KINTO Technologies Tech Blog
This article is the entry for day 8 in the KINTO Technologies Advent Calendar 2024 🎅🎄

Hello, this is HOKA from the Manabi-no-Michi-no-Eki (Learning Roadside Station) Team. It has been nearly a year since the team was launched, so I will be writing a blog post to reflect on its first twelve months.

## Our Thoughts and Aspirations One Year Ago

The team was formed exactly one year ago, at the end of November 2023, when Kin-chan, Nakanishi, and I came together with the goal of making the company's various study sessions more engaging and dynamic. Shortly after the beginning of 2024, we regrouped and created an inception deck. Here is the result:

As a "roadside station" where study sessions converge, we aim to enhance internal activities focused on learning and knowledge sharing.

- Internal communications
  - Keeping everyone informed about the study sessions being held.
  - Sharing what study sessions are like.
- Supporting study sessions
  - Providing guidance to those who have concerns, such as:
    - Wanting to start study sessions but not knowing where to begin.
    - Struggling to make their existing study sessions more engaging.

There are more details in the Tech Blog post about how it all got started.

## What We Did in the First Six Months

We started by sharing updates on the team activities during KTC's monthly all-hands meetings, commonly known as HQ meetings. I imagine the slide below has become a familiar sight to everyone in the company.

Activity summary:

- Providing a space for discussing study sessions as a whole—offering advice on starting them, addressing challenges, and finding solutions.
- Popping Into Next Door Study Sessions: an initiative where the administration team visits various study sessions to engage and support participants.
- KTC Podcast: interviews with study session organizers! Sharing voices from within the company, capturing both the messages and the atmosphere of the study sessions.
- Tech Blog: articles showcasing the various study sessions at KTC and sharing experiences of participating in them.

## New activities — Part 1

After about six months, we started to see people supporting and advocating for our activities. One challenge we faced was making it easier for people to search for available study sessions. Fortunately, an engineer from the Mobile App Development Group built a Slack-based search system to help solve this issue. It allows people to search for study sessions in the Slack channel by interacting with a character named "Manabyi." For more details, check out the developer's blog post:

https://blog.kinto-technologies.com/posts/2024-12-04_manabyi/

## New activities — Part 2

We also faced the challenge of making study session materials and videos accessible to everyone at any time. As we were exploring solutions, an engineer from the Corporate IT Group proactively reached out and developed a SharePoint-based portal site for the study sessions, called the "Manabi-no-Michi-no-Eki Portal." What was once a set of plain, unorganized folders has been transformed into a YouTube-like experience. With all the videos and materials in one place, people can:

- Watch recordings or review materials from study sessions they missed.
- Rewatch a session they attended to reinforce what they learned.

## Other Highlights That Made Us Happy

- We began receiving requests from organizers seeking discussions on improving their study sessions.
- When we invited people to be featured on the podcast, they all eagerly agreed to participate.
- The term "Manabi-no-Michi-no-Eki" was featured on a poster for a company-wide event.
- Internal materials from group companies included information about Manabi-no-Michi-no-Eki as part of introducing KTC.

Less than a year after starting, we never would have imagined that our activities would reach not only our own company but group companies as well. Whenever employees spontaneously express their appreciation for our team's activities, it gives us a tremendous boost of motivation.

## Joining the Developer Relations Group

In September 2024, Manabi-no-Michi-no-Eki officially became part of the Developer Relations Group.

### What is the Developer Relations Group?

In 2022, KINTO Technologies launched its Tech Blog and established a platform for engineers to share their knowledge through external events and internal study sessions. Delivering results through work is a form of output in itself, but we believe that study sessions and the Tech Blog also play a crucial role in providing engineers with valuable opportunities to share their knowledge and insights. The Developer Relations Group has created outlets for output that sit "between work duties."

When we launched Manabi-no-Michi-no-Eki at the end of 2023, it created a learning forum that fit into that same space between work duties: essentially, a space for input. This naturally aligned with the group's activities. Notably, part of the vision of the group's founder, Nakanishi, was to support engineers' growth by guiding them from input to output. So, in a way, these two initiatives may never have truly been separate to begin with.

## What the Manabi-no-Michi-no-Eki Team will do in the future

Please do check out Nakanishi's Tech Blog post:

https://blog.kinto-technologies.com/posts/2024-12-03-the-next-goal/
This article is the entry for day 7 in the KINTO Technologies Advent Calendar 2024 🎅🎄

## Introduction

Hello! I'm Iki, from the Cloud Infrastructure Group at the Osaka Tech Lab. In this article, I'll share how the Osaka Tech Lab team applied the MVP development method to build a chat API powered by generative AI.

## What is MVP (Minimum Viable Product) Development?

This method consists of first creating a product that has the bare minimum of features, then verifying and improving it while getting feedback from users (in this case, the Product Manager (PdM) who originally suggested creating it). I think it was an extremely efficient development method for handling the rough requirements we had for this project.

## What Inspired Us to Create a Chat AI

The idea came about naturally. KINTO Technologies has offices in Tokyo, Nagoya, and Osaka (Osaka Tech Lab), but nearly 80% to 90% of its employees are based in Tokyo. Since work isn't assigned by location, everyone, regardless of their office, contributes to Tokyo-led projects. This setup isn't an issue at all —if anything, Osaka team members are particularly close-knit compared to those at other locations! (At least, in my opinion.) Still, at some point, the Osaka members started thinking: wouldn't it be great to work on something uniquely Osaka-based? Around that time, they happened to cross paths with a Product Manager who was exploring the potential of chat systems powered by generative AI.

## Theme

For this project, we decided to use MVP development to verify whether chat using generative AI was a viable idea. However, the term "chat" covers a wide range of situations, from customer service to reporting to one's boss. Rather than tense, stiff conversations, we focused on whether it could emulate the kind of natural, relaxed way that people chat with family and friends.

## What We've Managed to Create So Far

### Overall structure

Note: Deciding that it would be better if there was a robot in the chat, we used BOCCO emo, a commercial robot by Yukai Engineering Inc.

### Azure schematic

![Azure schematic (simple version)](/assets/blog/authors/norio_iki/chmcha_azure_architecture.png =700x)

## What We Considered When Creating the Chat AI

### Time

The goal this time was to explore whether conversations powered by generative AI could work. However, even if we succeeded by investing a lot of time and effort, generative AI-powered conversations already exist in the world, so achieving that much is a given. In addition, the theme centers on conversation, a natural and intuitive part of human interaction, and we soon realized we had chosen an extremely rich theme, one that sparked a wealth of ideas for what we aspired to create. Consequently, there would have been no end to what we could do if we had the time. Deciding that it wasn't worth spending too long on an MVP, we aimed to complete it within the timeframe we had set.

### How much time we spent on MVP creation

| Task | Time spent |
| --- | --- |
| Considering the requirements and creating the MVP | 2 days |
| Verifying and improving it while getting feedback | 15 hours (max) |
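As a point of reference, the core of such a chat API boils down to a single chat-completion call. Below is a minimal sketch using the Azure OpenAI SDK for Python; the endpoint, deployment name, and system prompt are illustrative placeholders, not the team's actual configuration:

```python
import os

from openai import AzureOpenAI  # pip install openai

# Placeholders: endpoint and deployment name are illustrative, not the real setup.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

def casual_chat(user_message: str, history: list[dict]) -> str:
    """Return a relaxed, friendly reply, keeping prior turns as context."""
    messages = [
        # The theme: natural, relaxed chat rather than stiff assistant answers.
        {"role": "system",
         "content": "You are a friendly conversation partner. Chat casually, like family or friends."},
        *history,
        {"role": "user", "content": user_message},
    ]
    response = client.chat.completions.create(
        model="gpt-4o",  # the Azure deployment name (placeholder)
        messages=messages,
        temperature=0.8,  # a little looser, for casual conversation
    )
    return response.choices[0].message.content

print(casual_chat("Nice weather today, isn't it?", history=[]))
```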
## What Did We Actually Do?

### Considering the requirements and creating the MVP

At the beginning, we had not yet selected an environment to develop the MVP in, and we lacked any knowledge of how to create a generative AI-powered system. With things as they were, before we got as far as verifying the theme (i.e., whether natural, carefree chat can be achieved), we would first have had to verify how to build a generative AI system at all.

For expertise on generative AI systems, however, we got help from ZEN ARCHITECTS, the company that provides the Azure Light-up program. This let us focus on the theme. Besides accompanying us as we built the generative AI system, ZEN ARCHITECTS also gave us ideas based on real-world experience (for example, things to be careful about when using generative AI with a theme as loosely defined as ours), and pulled us along so that we managed to complete the MVP in two days.

### Verifying and improving it while getting feedback

Based on comments from people actually trying it out, the development members discussed and decided on what improvements to make. We added a feature that lets you chat based on your current location, and, to fix issues such as the robot only ever chatting about cafes, we spent a month (15 hours) running a cycle of collecting feedback on the prompts and other details.

To verify the current-location chat feature, we did not simply stay in the office and fake our location data; we physically went to different places. Since KINTO is a car subscription provider, we also did fittingly unique real-time updates, such as deployments made while receiving feedback in a car. Creating it in-house means we can do things like this! (We repeatedly ran verifications and made improvements with this attitude to guide us!)

![Deployment while driving](/assets/blog/authors/norio_iki/drive_deploy.jpg#right =400x)

## Conclusion

The chat API using generative AI that we created in this project is currently undergoing in-house verification regarding its future potential. If its value exceeds its expected future cost, we plan to push ahead with developing it further. That said, continued development might be canceled if it is deemed premature. Even if it is not continued, however, the things we confirmed and the experience we gained in a new area (building generative AI systems on Azure) will remain as outcomes of the project, and those outcomes will fuel new ideas. (For example, they can be fed back into existing systems.) Even if this project does not continue, I will not think of it as a failure. I believe we can create an environment that lets us constantly move forward and run the innovation cycle, doing MVP development that cannot truly fail. I would like to keep tackling the challenges that arise with MVPs, treating that in itself as the goal!

Also, if you want to know a little more about the details, ZEN ARCHITECTS has published a case study on the project, so please check it out: Azure Light-up
Hello. I'm Shimamura from the Platform Group's Platform Engineering team. I'm responsible for the development, operation, and deployment of tools based on platform engineering principles (taking on roles similar to a team leader; recently, I've also started working on scratch development, which has me struggling a bit with things like Scrum and programming languages).

This article is the entry for day 4 in the KINTO Technologies Advent Calendar 2024 🎅🎄

## Background

There is a thing called an SBOM (Software Bill of Materials). In 2024, Japan's Ministry of Economy, Trade and Industry emphasized the importance of utilizing it for security measures. Similarly, back in 2021, the United States also discussed making it a standard practice. However, introducing new SaaS, OSS, or other tools for this purpose might seem like a significant hurdle. KINTO Technologies leverages OSS and incorporates it into the CI/CD pipeline to create and manage a standardized SBOM. In the context of DevSecOps, we also perform vulnerability scanning and EOL (end-of-life) inspection in conjunction with SBOM generation. I'll introduce our "minimal start" and provide examples of improvements made along the way.

## Pros and Cons

The most important thing is having visibility into what is being used. You may remember the Log4j vulnerability from some time ago, and the inquiries that followed: "We don't use it, do we?" At that time, we answered by checking configuration files or applying common temporary workaround settings; now, you can find out in a single search.

Pros:

- EOL/SBOM management can be started for free! 👌
  - There are paid services such as Yamory (a cloud service) and BlackDuck, and Microsoft provides SBOM-TOOL for SBOM generation, but paid software and SaaS services come with application processes and other hassles.
- The tools we use are available as GitHub Actions, making them easy to use.
- They can also be run locally, so there are many potential use cases to explore.

Cons:

- Many of the tools, xeol among them, are supported by volunteers.
- Tools like xeol and syft are frequently updated, and sometimes file inconsistencies occur due to version changes.

## Tool List

| Name | Function | Overview |
| --- | --- | --- |
| Syft | SBOM generation | Software provided by Anchore to generate SBOMs from files and container images. Both CycloneDX and SPDX, the standard SBOM formats, are supported, but note that the default is Syft's proprietary format. At KINTO Technologies, we output CycloneDX in XML format, because with JSON a version change alters the structure considerably, which increases the work needed when importing. |
| XEOL | EOL scanner | A scanner that determines whether components past the EOL dates provided by XEOL are included. It is internally based on Syft and matches against the information in endoflife.date. The major languages and operating systems are supported, and the response to issues is impressively quick. While a SaaS version is also available, we use the OSS here. |
| Trivy | Vulnerability scanner | A vulnerability scanner provided by Aqua. Although Anchore also offers a vulnerability scanner called Grype, which pairs well with Syft, we chose Trivy instead because its behavior on CI/CD pipelines (e.g., its output display) is more favorable for vulnerability detection. It can scan a wide range of targets, including files, repositories, container images, and SBOMs, making it versatile. While it can also generate SBOMs, since XEOL's core relies on Syft, we opted to use Trivy exclusively for vulnerability scanning this time, for compatibility. |
| GitHub Actions | CI/CD tool | The CI/CD tool included in GitHub. At KINTO Technologies, we use GitHub Actions for tasks such as building and releasing applications. For SBOM management, SBOM generation is incorporated into the workflow at container-creation time. |
| CMDB (in-house) | CMDB | A configuration management database. Since rich functions were unnecessary, KINTO Technologies developed its own CMDB. Recent additions allow repository information, EOL information, and SBOM packages to be retrieved and searched. |

## Workflow Diagram Excerpt

### Pipeline (GitHub Actions)

:::message
I excluded the container build and push steps, as their timing is left to individual judgment. Since the target of Trivy's vulnerability scan is images, the build process is assumed to occur before this excerpt.
:::

Basically, this task should be executed during the deployment phase rather than during the application build phase. While version management for SBOM files was an option, we chose to overwrite them, as the latest version is deemed sufficient. Note that incorporating this into the build phase would result in an SBOM that does not reflect the container actually present in the workload.

The reason the pipeline calls XEOL twice is to display logs in GitHub Actions and to generate a file. Since the modes appear to be different and a single unified invocation wasn't possible, we decided to separate them.

```yaml
## Vulnerability scanning with Trivy
- name: Run Trivy vulnerability scanner
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: '${{ ImageName }}:${{ IMAGETAG }}'
    format: 'table'
    exit-code: '0' ## If you want to proceed with ImageBuild/Push despite vulnerabilities, set this to "0"
    ignore-unfixed: false
    vuln-type: 'os' ## If you want to include Java, etc., add "library"
    severity: 'CRITICAL,HIGH'

## SBOM generation with SYFT
- name: Run Syft make sbom files(format cyclone-dx)
  uses: anchore/sbom-action@v0
  with:
    image: '${{ ImageName }}:${{ IMAGETAG }}'
    format: cyclonedx
    artifact-name: "${{ github.event.repository.name }}-sbom.cyclonedx.xml"
    output-file: "${{ github.event.repository.name }}-sbom.cyclonedx.xml"
    upload-artifact-retention: 5 ## Artifact expiration date

## EOL library detection (display in WF) and file creation from SBOM in XEOL
- name: Run XEOL mw/sw EOL scanner from sbom file
  uses: noqcks/xeol-action@v1.1.1
  with:
    sbom: "${{ github.event.repository.name }}-sbom.cyclonedx.xml"
    output-format: table
    fail-build: false

- name: Run XEOL mw/sw EOL scanner from sbom file and Output file
  uses: noqcks/xeol-action@v1.1.1
  id: xeol
  with:
    sbom: "${{ github.event.repository.name }}-sbom.cyclonedx.xml"
    output-format: json
    fail-build: false

## AWS credential settings (SBOM)
- name: AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    role-to-assume: ${{ S3_ACCOUNT_ROLE }}
    aws-region: ${{ AWS_REGION }}

## Save SBOM/EOL in the team's S3 bucket to be managed
## Extracted with cut, as files are organized hierarchically in S3 by image name or repository name.
- name: SBOM and EOL file sync to s3 bucket
  run: |
    ECRREPOS=`echo ${{ ImageName }} | cut -d "/" -f 2,3`
    echo $ECRREPOS
    aws s3 cp ${{ github.event.repository.name }}-sbom.cyclonedx.xml s3://${{ s3-bucket-name }}/${{ github.event.repository.name }}/$ECRREPOS/sbom-cyclonedx.xml
    aws s3 cp ${{ steps.xeol.outputs.report }} s3://${{ s3-bucket-name }}/${{ github.event.repository.name }}/$ECRREPOS/eol-result.json
```
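Once the SBOM and EOL files land in S3 with this layout, answering "are we using library X?" no longer requires opening each repository. Here is a minimal sketch, assuming the bucket layout from the step above (the bucket name, key, and CycloneDX schema version are placeholders and may differ from the real setup):

```python
import xml.etree.ElementTree as ET

import boto3  # pip install boto3

# Placeholders: bucket and key follow the layout used in the workflow above.
BUCKET = "my-sbom-bucket"
KEY = "my-repo/my-ecr/service/sbom-cyclonedx.xml"

s3 = boto3.client("s3")
body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()

# In CycloneDX XML, every package is a <component> with name/version children.
ns = {"c": "http://cyclonedx.org/schema/bom/1.4"}  # schema version may differ
root = ET.fromstring(body)
for comp in root.iter(f"{{{ns['c']}}}component"):
    name = comp.findtext("c:name", namespaces=ns)
    version = comp.findtext("c:version", namespaces=ns)
    if name and "aws-core" in name.lower():  # e.g., "are we using aws-core?"
        print(f"{name} {version}")
```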
## Improvements Made and Future Plans

The results of checking whether my department is using the AWS-CORE library:

The results of checking whether my department has EOL components:

Since KINTO Technologies has integrated the SBOM/EOL lists into the CMDB, we can search for and confirm the following:

- What libraries are being used in my product?
- Are there any components that have reached EOL?

With the in-house CMDB, libraries and packages approaching EOL are also highlighted, making it easy to respond in advance. (The first step of our SBOM management was to output SBOM files as JSON, format them with jq, and manage them in Excel.) In the future, we would like to start periodically opening tickets requesting responses from products that include EOL components. This will involve collaboration with the security team and many others.

## Impressions

I looked into tools for EOL management, but there aren't many dedicated ones; most seem to be extensions of SBOM management, software management, or asset management. Given the prevalence of development involving OSS and libraries, regularly monitoring EOL can be highly effective as part of vulnerability countermeasures. Libraries often reach EOL within one to two years, requiring frequent updates to keep up with version changes. Being able to see the current state is a strong countermeasure against "not noticing" or "ignoring" issues. Similar to observability (o11y), the first goal should be to "make it visible." In fact, it may not be truly effective until it is built up into an operational step that includes requests for fixes. However, I wrote this article under the title "Starting with Minimal" to say: "Let's build it first! Let's get started!"

## Conclusion

The Platform Engineering team manages and develops tools used internally across the organization. We leverage tools and solutions created by other teams within the Platform Group, and, based on the company's requirements, we either develop new tools from scratch or migrate existing components as needed. We are automating routine tasks for the MSP team and are also starting to explore the CDK, so we have begun programming in addition to managed services. If you're interested in these activities or would like to learn more, please don't hesitate to reach out to us.
This article is the entry for day 3 in the KINTO Technologies Advent Calendar 2024 🎅🎄

It's incredible to think that nearly a year has already gone by since the Learning Roadside Station was established! I'm Nakanishi from the DevRel (Developer Relations) Group, and I strongly believe that the Learning Roadside Station team plays a pivotal role in enhancing KINTO Technologies' learning culture. In the DevRel team, we work tirelessly connecting the dots of individual growth —from input to output— to reshape our organizational culture.

## A major shift supporting our engineering culture

This year brought a significant change to our engineering culture: the establishment of the DevRel Group. Until recently, our activities had only been carried out as a project. However, we officially formed a team this spring, making it a fully-fledged part of the organization. From now on, some team members are dedicated full-time, no longer balancing this role with other responsibilities. This is an important milestone in the company's effort to empower employees to share their knowledge and insights. Moreover, this year, in addition to establishing the team, we gained recognition for the input-focused initiatives we had envisioned from the early days of the Tech Blog's creation.

## Explaining the Technical PR Group's activities using a car analogy

If we were to compare our output-focused efforts to a car, they would be like replacing a muffler or modifying the engine structure to improve exhaust efficiency. On the other hand, our input-focused initiatives, represented by the Learning Roadside Station, are like switching the fuel from regular gasoline to premium or even nitro. They are also comparable to transitioning from carburetors to fuel injection, or adding a turbocharger. In other words, this team is responsible for transforming the fuel and figuring out how to inject it into the engine as efficiently as possible. If we think of the engine as representing individuals or teams, the type and amount of input needed will differ for each engine. For example, diesel engines and gasoline engines require different methods of fuel input, and efficiency also varies with the engine's displacement.

## Centralizing learning within the company

That was a long introduction, but at the Learning Roadside Station, we are working to optimize learning inputs for each individual and create a system that leverages the strengths shared among employees. Over the past year, we have focused on centralizing learning activities: visualizing study sessions, sharing and promoting study session information, supporting their organization, and consolidating scattered study resources across the company into one place. From now on, we will focus on "K to K" to bring people together and strengthen connections across the company.

## What is "K to K"?

"K to K" stands for "KINTO Technologies to KINTO Technologies." It's all about helping employees share their skills and learn from one another. For example, if someone says:

- "I don't have enough analytical skills." → They could consult with the Analysis Group.
- "I want to learn coaching." → Ask person A, who could organize coaching sessions.
- "I'm curious about project management." → Reach out to PdM (Product Management) team members.

By connecting the energy of those who wish to learn with the energy of those who want to share their expertise, such as:

- "I want to make better use of my skill A."
- "I know someone who has this skill but is looking for the right opportunity to use it."

we can help bring out the best in our employees and energize the organization. The next action for the Learning Roadside Station is to focus on what we do best: bringing out the strengths of individuals and teams, and connecting the dots across the organization. We look forward to tackling the challenges that lie ahead and are excited to see where our efforts will take us in the coming year.

## Other activities

As part of our input-focused initiatives, we are also considering an expanded version of the podcast we currently run. At the moment, our main activity is interviewing those who host study sessions. However, as with the "K to K" activities, there are many things we would like to share across the company. By turning these efforts into podcast content, we hope to create more opportunities for employees to learn, and even provide a venue for output, sharing knowledge beyond the company. Additionally, we are implementing Udemy Business as part of our efforts. We aim to explore how to use video content effectively and make the most of this platform.

## Conclusion

Along with tech blogs, events, and presentations, we aim to expand the scope of the Learning Roadside Station as the next central pillar of our efforts. By continuously growing our input-focused initiatives, we hope to broaden our business horizons and create an environment where every employee can grow, thrive, and fully utilize their unique strengths. We will continue to actively share our progress and plans next year, so please stay tuned!
Hello. We are @p2sk and @hoshino from the DBRE team. The DBRE (Database Reliability Engineering) team is a cross-functional organization focused on resolving database (DB) issues and developing platforms.

In this article, we introduce an automatic review function for DB table designs, built on a serverless architecture using AWS's generative AI service, Amazon Bedrock. The function works with GitHub Actions: when a pull request (PR) is opened, the AI automatically reviews it and proposes corrections as comments. We also explain how we designed and implemented the evaluation of generative AI applications, covering the evaluation methods adopted for each of the three phases of the LLMOps lifecycle (model selection, development, and operation) and, in particular, automated evaluation using "LLM-as-a-Judge" in the operation phase.

## Purpose of this article

This article aims to provide easy-to-understand information on generative AI application evaluation, ranging from abstract concepts to concrete implementation examples. We hope that even engineers without specialized machine learning knowledge, like our DBRE team, will gain a better understanding of the generative AI development lifecycle. We will also introduce the challenges we faced when using generative AI in our services and how we solved them, with concrete examples. In addition, you can read this article as one implementation example of the "Considerations and Strategies for Practical Implementation of LLM" introduced in the session "Best Practices for Implementing Generative AI Functions in Content Review" held at the recent AWS AI Day. I hope this article will be of some help to you.

## Table of Contents

This is a long article, so if you're just interested in how the system works, I recommend reading up to the section on the "Completed System." If you're interested in developing generative AI applications, I recommend continuing beyond that.

- Background
- Design
- Completed System (with demo video)
- Ideas in Implementation
- Evaluation of Generative AI Applications
- Lessons Learned and Future Prospects
- Conclusion

## Background

### Importance of database table design

Generally, DB tables have the characteristic that once they are created, they are difficult to modify. As a service grows, the amount of data and the frequency of references tend to increase, so we want to avoid, as much as possible, carrying technical debt that later makes us say, "I should have done it this way at the design stage..." Therefore, it is important to have a system in place that allows tables to be created with a "good design" based on unified standards. "10 Things to Do to Get Started with Databases on AWS" also states that table design remains a valuable task, even though database management has been automated in the cloud.

In addition, the spread of generative AI is making the data infrastructure even more important. Tables designed to uniform standards are easy to analyze, and easy-to-understand naming and appropriate comments have the advantage of providing good context for generative AI. Given this background, the quality of DB table design has a greater impact on an organization than ever before. One way to ensure quality is to create in-house guidelines and conduct reviews based on them.
### Our current situation regarding reviews

At our company, table design reviews are carried out by the person in charge of each product. The DBRE team has provided "Design Guidelines," but they are currently non-binding. We considered having DBRE review the table designs of all products across the board, but since there are dozens of products, we were concerned that DBRE acting as a gatekeeper would become a bottleneck in development, so we gave up on that idea. Against this background, we, the DBRE team, decided to develop an automatic review system that acts as a guardrail, and to apply it to our products.

## Design

### Abstract architecture diagram and functional requirements

The following is the abstract architecture diagram for the automatic table design review function. To perform automated reviews continuously, it is important to integrate them into the development workflow. For this reason, we adopted a design in which a PR triggers the automatic execution of an application on AWS, and feedback is provided within the PR, including comments with suggested corrections to the table definitions (DDL). The requirements for the application are as follows:

- The ability to set our company's own review criteria.
- To complement human reviews, it should be as accurate as possible, even if not 100%.

### Policies for implementing the review function

There are two possible policies for automating table design reviews: (1) review through syntax analysis and (2) review through generative AI. The characteristics of each policy are summarized as follows.

Ideally, (1) should be applied to review criteria that can be handled through syntax analysis, and (2) to the other criteria. For example, verifying the naming rule "object names should be defined in lower snake case" can be handled by (1). On the other hand, subjective criteria, such as "object names should allow inferring the stored data," are better suited to (2). Ideally, the two policies should be used separately depending on the review criterion, but this time we decided to implement only "(2) review through generative AI," for the following reasons:

- (1) is clearly feasible, but the feasibility of (2) cannot be determined until we try it, so we decided it was worth attempting first.
- By implementing the items that could be handled by (1) in (2) as well, we aimed to gain insights into the accuracy and implementation costs of both policies.

### Review target guidelines

To shorten the time to delivery, we narrowed the review items down to the following six:

1. An index must comply with the DBRE team's designated naming rules.
2. Object names are defined in lower snake case.
3. Object names must consist of alphanumeric characters and underscores only.
4. Object names must not use romanized Japanese.
5. Object names must allow inferring the stored data.
6. Columns that store boolean values should be named without using "flag".

The top three can be addressed with syntactic analysis (see the sketch below for a feel of what that looks like), but the bottom three are likely better addressed with generative AI, which can also provide suggested corrections.
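As a reference point for policy (1), checks like the top three can be a few lines of code. A minimal sketch, where the rules are simplified paraphrases of the guidelines, not the DBRE team's actual implementation:

```python
import re

# Simplified paraphrases of guidelines 2 and 3 (illustrative, not the real rules).
LOWER_SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def check_object_name(name: str) -> list[str]:
    """Return guideline violations detectable by pure syntax analysis."""
    violations = []
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        violations.append("must consist of alphanumeric characters and underscores only")
    if not LOWER_SNAKE_CASE.fullmatch(name):
        violations.append("must be lower snake case")
    return violations

print(check_object_name("UserFlag"))   # ['must be lower snake case']
print(check_object_name("user_name"))  # []
```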
### Why create a dedicated system?

Although several systems (mechanisms) for review using generative AI already exist, we determined that they do not meet our requirements, so we decided to create a dedicated one. For example, PR-Agent and CodeRabbit are well-known generative AI review services, and our company has adopted PR-Agent for reviewing code and Tech Blog articles. In addition, GitHub Copilot's automated review function is currently available as a public preview and may become generally available in the future. That function also allows you to have your code reviewed in Visual Studio Code before pushing it, and we expect such "generative AI review systems" to become ever more seamlessly integrated into development flows. Additionally, you can define your own coding standards in the GitHub management screen and have Copilot review against them.

Some of the reasons we still wanted to build our own system:

- It is difficult to check a large number of guidelines with high accuracy using generative AI, and we judged it currently challenging to handle this with external services.
- We want to adjust the feedback method flexibly. Example: columns like "data1" have meanings too ambiguous to suggest corrections for, so we want to flag them in comments only.
- In the future, we aim to improve accuracy with a hybrid structure that combines syntax analysis.

Next, we will introduce the completed system.

## Completed system

### Demo video

After the PR is created, GitHub Actions runs, and the generative AI provides feedback on the review results as comments on the PR. The actual processing time is approximately 1 minute and 40 seconds, but the waiting time has been omitted from the video. The generative AI cost when using Claude 3.5 Sonnet is estimated at approximately 0.1 USD per DDL review.

https://www.youtube.com/watch?v=bGcXu9FjmJI

### Architecture

The final architecture is as shown in the diagram below. Note that we built a separate evaluation application to tune the prompts used, which is described in detail later.

### Process Flow

1. When a PR is opened, a GitHub Actions workflow is triggered, and an AWS Step Functions state machine is started. At this stage, the PR URL and the GITHUB_TOKEN generated in the workflow are saved to DynamoDB. The reason for not passing the DDL directly to Step Functions is to avoid input character limits; the DDL is instead extracted on the Lambda side based on the PR URL. (A minimal sketch of this hand-off appears at the end of this section.)
2. Step Functions uses a Map state to review each DDL in parallel. Only one guideline is checked per review; to review against multiple guideline criteria, the "post-correction DDL" obtained from the first prompt is repeatedly passed to the next prompt, generating the final DDL (the reason is explained later).
3. After completing the review, feedback is provided as a comment on the PR.
4. The review results are stored in S3, and the generative AI evaluates them using LLM-as-a-Judge (more details later).

Examples of the results are shown below. As feedback from the generative AI, the "applied guidelines" and "suggested corrections" are provided as comments (on the left side of the image). The details are collapsed and can be expanded to check the specific corrections and comments made to the DDL (on the right side of the image).

### Steps required for implementation

The table design review feature can be introduced in just two steps. Since setup takes only a few minutes, you can easily adopt it and start generative AI reviews immediately:

1. Register the key required to access AWS resources in GitHub Actions Secrets.
2. Add a GitHub Actions workflow for the review function to the target GitHub repository; simply add the product name to the template file provided by the DBRE team.
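For illustration, the hand-off in step 1 of the process flow (stashing the PR URL and the short-lived GITHUB_TOKEN in DynamoDB with a TTL, then starting the state machine) could look roughly like this; the table, state machine, and field names are hypothetical:

```python
import json
import os
import time

import boto3  # pip install boto3

# Hypothetical resource names, for illustration only.
TABLE_NAME = "pr-review-context"
STATE_MACHINE_ARN = "arn:aws:states:ap-northeast-1:123456789012:stateMachine:ddl-review"

dynamodb = boto3.client("dynamodb")
sfn = boto3.client("stepfunctions")

pr_url = os.environ["PR_URL"]            # provided by the GitHub Actions workflow
github_token = os.environ["GITHUB_TOKEN"]

# Store the short-lived token with a TTL so it expires on its own.
dynamodb.put_item(
    TableName=TABLE_NAME,
    Item={
        "pr_url": {"S": pr_url},
        "github_token": {"S": github_token},
        "ttl": {"N": str(int(time.time()) + 3600)},  # expire after 1 hour
    },
)

# Pass only the PR URL to Step Functions to stay under input size limits;
# a Lambda then extracts the DDL from the PR itself.
sfn.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps({"pr_url": pr_url}),
)
```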
Next, we will introduce some of the ideas we came up with for our implementation.

## Ideas in Implementation

### Utilization of container images and Step Functions

Initially, we planned to implement everything with Lambda alone, but we ran into the following challenges:

1. The library size is too large and exceeds the 250 MB deployment package size limit for Lambda.
2. When chaining and evaluating multiple guideline criteria, the maximum execution time of Lambda (15 minutes) might be reached.
3. When processing DDLs serially, the execution time grows with the number of DDLs.

To solve issue 1, we adopted container images for Lambda. To solve issues 2 and 3, we introduced Step Functions and changed the design so that each Lambda execution evaluates one DDL against one guideline criterion. Furthermore, by using a Map state to process each DDL in parallel, we ensured that overall processing time is not affected by the number of DDLs. The diagram below shows the implementation of the Map state, where the prompt chain is realized in the loop section.

### Measures against Bedrock API throttling

During a review, Bedrock InvokeModel requests are issued in proportion to the number of DDLs multiplied by the number of guidelines, and errors sometimes occurred due to quota limits. According to the AWS documentation, this limit cannot be raised. For this reason, we introduced a mechanism that distributes requests on a per-DDL basis across multiple regions and, in the event of an error, retries in yet another region. This has led to stable reviews, mostly without hitting the rate limit. That said, cross-region inference, which dynamically routes traffic among multiple regions and lets us delegate throttling countermeasures to AWS, is now available, so we plan to transition to it in the future.

### Organizing how to grant Lambda permission to call the GitHub API

To enable Lambda to "obtain the changed files of the target PR" and "post comments on the target PR," we compared the following three ways of granting permissions:

| Token type | Expiration | Advantages | Disadvantages |
| --- | --- | --- | --- |
| Personal Access Token | Can be unlimited, depending on settings | Broad permission scope | Dependent on an individual |
| GITHUB_TOKEN | Only during workflow execution | Easy to obtain | Possibly insufficient permissions, depending on the processing required |
| GitHub App (installation access token) | 1 hour | Can grant permissions not supported by GITHUB_TOKEN | More complex steps when introducing it to a product |

We adopted GITHUB_TOKEN, for the following reasons:

- Tokens are short-lived (only for the duration of the workflow) and pose a low security risk.
- Token issuance and management are automated, reducing the operational burden.
- The permissions necessary for this processing can be granted.

The tokens are stored in DynamoDB with a time to live (TTL) and retrieved by Lambda when needed. This allows tokens to be used safely without having to check whether the token-passing process might be logged.
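Returning to the throttling countermeasure above, a per-request region fallback can be sketched as follows; the region list and model ID are illustrative, and the actual implementation may differ:

```python
import json

import boto3
from botocore.exceptions import ClientError

# Illustrative fallback order and model ID; the real configuration may differ.
REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"

def invoke_with_fallback(prompt: str) -> str:
    """Try each region in turn; move on to the next when a request is throttled."""
    body = json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
    })
    for region in REGIONS:
        client = boto3.client("bedrock-runtime", region_name=region)
        try:
            response = client.invoke_model(modelId=MODEL_ID, body=body)
            return json.loads(response["body"].read())["content"][0]["text"]
        except ClientError as err:
            if err.response["Error"]["Code"] != "ThrottlingException":
                raise  # only throttling triggers the region fallback
    raise RuntimeError("All regions throttled; giving up")
```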
In the following, we present evaluation examples for generative AI applications.

## Evaluation of Generative AI Applications

For generative AI application evaluation, we referred to the diagram below from Microsoft's documentation.

Source: Microsoft - Evaluation of Generative AI Applications

According to this diagram, there are three types of evaluation to be performed during the GenAIOps lifecycle (LLMOps here, because the target is an LLM):

- Model selection phase: evaluating the base models and deciding which model to use.
- Application development phase: evaluating the application output (≒ the generative AI's responses) from the perspectives of quality, safety, etc., and tuning it.
- Post-deployment operation phase: continuing to evaluate quality, safety, etc., even after deployment to the production environment.

Below, we introduce examples of how we conducted evaluations in each phase.

### Evaluation during the model selection phase

We selected an Amazon Bedrock foundation model, evaluating candidates based on Chatbot Arena scores and the advice of our in-house generative AI experts, and adopted Claude from Anthropic. We ran DDL reviews with Claude 3 Opus, the highest-performing model at the time of our launch, and confirmed its accuracy to a certain extent. Each model differs in base performance, response speed, and monetary cost, but since reviews here are infrequent and there is no requirement for "maximum speed," we chose the model with the greatest emphasis on performance. Based on Claude's best practices, we determined that further accuracy could be achieved through prompt tuning and moved on to the next phase. (Along the way, the higher-performing and faster Claude 3.5 Sonnet was released, which further improved inference accuracy.)

### Evaluation during the application development phase

Generative AI evaluation methods are clearly summarized in the article here. As that article states, "various evaluation patterns are possible depending on the presence or absence of a prompt, foundation model, and RAG," so the evaluation pattern varies depending on what is being evaluated. Here, we focus on the evaluation of a single prompt and give a concrete example of the design and implementation for our specific use case: having a database table design reviewed according to our company's guidelines.

#### Prompt tuning and evaluation flow

Prompt tuning and evaluation were carried out according to the diagram below, as described in Claude's documentation.

Source: Anthropic - Create strong empirical evaluations

The key point is to define an evaluation perspective, such as "how close the prompt execution result is to expectations," as a score calculated in some way, and to adopt the prompt with the best score. Without an evaluation system (mechanism), judging the improvement in accuracy before and after tuning tends to rely on subjective judgment, which leads to ambiguity and longer working time. In the following, we first introduce the generative AI evaluation method, followed by examples of prompt tuning.

#### What is "generative AI evaluation"?

The page for the generative AI evaluation product "deepchecks" states the following about evaluation:

Evaluation = Quality + Compliance

I felt this was the most concise way to express the evaluation of generative AI applications. Breaking it down further, the article here classifies the evaluation criteria for service providers into four perspectives: truthfulness, safety, fairness, and robustness. The evaluation criteria and score calculation method should be selected according to the properties of the application. For example, Amazon Bedrock uses different metrics for different tasks, such as the BERT score for text summarization and the F1 score for question answering.
#### Methods for calculating evaluation scores

anthropic-cookbook classifies the methods for calculating scores into three main categories:

Summary of the score calculation methods described in anthropic-cookbook

You can use cloud services or OSS, or create your own score calculation logic. In any case, you need to set your own evaluation criteria. For example, if the LLM's output is in JSON format, "matching each element" may be more appropriate than "matching the entire string." Regarding model-based grading, the code provided in anthropic-cookbook can be expressed more concisely as follows:

```python
def model_based_grading(answer_by_llm, rubric):
    prompt = f"""
    Evaluate the answer within the <answer> tag from the perspective of the <rubric> tag.
    Answer "correct" or "incorrect".
    <answer>{answer_by_llm}</answer>
    <rubric>{rubric}</rubric>
    """
    return llm_invoke(prompt)  # Pass the created prompt to the LLM for inference

rubric = "Correct answers must include at least two different training plans."

answer_by_llm_1 = "The recommended exercises are push-ups, squats, and sit-ups."  # in reality, LLM output
grade = model_based_grading(answer_by_llm_1, rubric)
print(grade)  # Should print "correct"

answer_by_llm_2 = "The recommended training is push-ups."  # in reality, LLM output
grade = model_based_grading(answer_by_llm_2, rubric)
print(grade)  # Should print "incorrect"
```

#### Summary of evaluation

To summarize the content covered so far, the evaluation method is illustrated in the diagram below. Abstractly, the evaluation of generative AI breaks down into Quality and Compliance. These are broken down further, and specific evaluation criteria are set for each use case. Each criterion needs to be quantified, and this can be done based on "Code," "Human," or "Model." In the following, we explain the specific evaluation method from the perspective of our database table design review.

#### Evaluation design for the DB table design review

We chose a code-based approach to quality evaluation, for the following reasons:

- A cycle of evaluation and tuning by humans increases man-hours and is not worth the resulting benefit.
- We also considered a model-based approach, but since we wanted to assign the best score to a perfect match with the correct DDL, we concluded that a code-based approach was more appropriate.

Since newly implementing a "similarity between DDLs" metric against the correct data is difficult, we adopted the Levenshtein distance, a method for measuring the distance between texts, as the score calculation method. With this method, a perfect match has a distance of 0, and the higher the value, the lower the similarity. However, since this indicator does not completely capture "similarity between DDLs," we basically aimed for a score of 0 on all datasets and performed prompt tuning on datasets with non-zero scores. The algorithm is also provided by LangChain's String Evaluators (String Distance), which is what we use. From a compliance perspective, on the other hand, we decided evaluation was unnecessary this time, because this is an in-house application and the implementation limits the user input embedded in the prompt to DDL.

#### Implementation of the evaluation

The flow of the implemented evaluation is as follows. For each review perspective, we created 10 dataset patterns combining an input DDL and a correct-answer DDL. To efficiently repeat prompt tuning and evaluation, we developed a dedicated application using Python and Streamlit.
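The scoring core of that application relies on LangChain's string-distance evaluator, so it is only a few lines. A minimal sketch, where the DDL strings are placeholders (requires `pip install langchain rapidfuzz`):

```python
from langchain.evaluation import StringDistance, load_evaluator

# Levenshtein distance between the LLM's corrected DDL and the ground-truth DDL.
evaluator = load_evaluator("string_distance", distance=StringDistance.LEVENSHTEIN)

result = evaluator.evaluate_strings(
    prediction="CREATE TABLE users (is_active BOOLEAN);",  # LLM output (placeholder)
    reference="CREATE TABLE users (is_active BOOLEAN);",   # correct DDL (placeholder)
)
print(result["score"])  # 0.0 for a perfect match; larger means less similar
```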
The dataset is saved in JSONL format, and when you specify the file, the evaluation runs automatically and the results are displayed. Each JSON record contains the "target guideline(s)", "parameters for invoking the LLM", "input DDL", and "correct DDL", as shown below:

```json
{
  "guidline_ids": [1, 2],
  "top_p": 0,
  "temperature": 0,
  "max_tokens": 10000,
  "input_ddl": "CREATE TABLE sample_table (...);",
  "ground_truth": "CREATE TABLE sample_table (...);"
}
```

When displaying individual results, the diff between the output DDL and the correct DDL is shown, making it possible to visualize the differences (= tuning points). Once the evaluation is complete, you can also check the aggregated score results.

#### Prompt tuning

Based on Claude's documentation, we created and tuned the prompts with the following points in mind, and ultimately achieved the best result (a score of 0) on almost all 60 datasets:

- Setting roles
- Utilizing XML tags
- Letting Claude think (instructing Claude to show its thinking process, to make debugging easier when the answer is disappointing)
- Few-shot prompting (providing example output)
- Putting reference data at the beginning and instructions at the end
- Giving clear and specific instructions
- Chaining prompts

These techniques are well known, so we won't go into detail here, but let us add some information on the last two points.

##### "Giving clear and specific instructions"

Initially, we embedded the text of our in-house table design guidelines directly into the prompt. However, the guidelines only described "how things should be" and did not include "specific steps to correct the errors." Therefore, we rewrote them into "specific correction instructions" in a step-by-step format. For example, we revised the guideline "Do not use xxx_flag for column names that store boolean values" as follows:

```text
Follow the steps below to extract the column names storing boolean values and change them to appropriate column names if necessary.
1. Extract the names of the columns that store boolean values. The criteria are whether the column uses the boolean type or contains the word "flag."
2. Check the names of the boolean columns one by one and understand the meaning of each column first.
3. Check the names of the boolean columns one by one, and if you determine there is a more appropriate name, modify the column name.
4. For the conditions an appropriate column name must meet, refer to the <appropriate_column_name></appropriate_column_name> tag.
<appropriate_column_name>
Without using the word "flag"...
...
</appropriate_column_name>
```

##### "Chaining prompts"

The more guidelines there are to check, the more complex the prompt becomes if you try to check them all in one go, raising concerns that checks will be missed or accuracy will decrease. For this reason, we limited each prompt execution to checking exactly one item. We also reflected this in the architecture: the "corrected DDL" obtained from the first prompt is passed as input to the next prompt (a chain), and the process repeats until the final DDL is obtained. Chaining prompts also offers the following advantages:

- Since prompts are short and each task is limited to one item, accuracy improves.
- When adding a guideline, you only need to create a new prompt, so there is no impact on the accuracy of existing prompts.

On the other hand, time and monetary costs increase as the number of LLM invocations grows.
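Structurally, the chain is just a loop that threads the corrected DDL through one prompt per guideline. A minimal sketch; the prompt template is illustrative, and `invoke_llm` stands in for whatever actually calls the model (for instance, the hypothetical `invoke_with_fallback` from the throttling section):

```python
from typing import Callable

# Illustrative prompt template; the real prompts follow the step-by-step
# correction instructions shown above.
REVIEW_PROMPT = """<guideline>{guideline}</guideline>
<input_sql_ddl>{ddl}</input_sql_ddl>
Based on the guideline, output only the corrected DDL.
"""

def review_ddl(ddl: str, guidelines: list[str],
               invoke_llm: Callable[[str], str]) -> str:
    """Check one guideline per LLM call, feeding each corrected DDL into the next prompt."""
    corrected = ddl
    for guideline in guidelines:
        corrected = invoke_llm(REVIEW_PROMPT.format(guideline=guideline, ddl=corrected))
    return corrected

# Usage: final_ddl = review_ddl(ddl, guidelines, invoke_llm=invoke_with_fallback)
```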
### Evaluation during the post-deployment operation phase

In the prompt creation stage, we evaluated quality using manually created correct-answer data. In a production environment, however, no correct-answer data exists, so a different evaluation approach is required. We therefore adopted LLM-as-a-Judge, an approach in which an LLM evaluates the LLM's responses. According to the Confident AI documentation, there are three variants of this approach:

- Single Output Scoring (without correct-answer data): give the LLM the "LLM output" and "evaluation criteria" and have it assign a score based on the criteria.
- Single Output Scoring (with correct-answer data): in addition to the above, "correct-answer data" is also provided; a more accurate evaluation can be expected.
- Pairwise Comparison: compare two outputs and determine which is better, with the criteria for "better" defined by you.

This time, we used Single Output Scoring without correct-answer data. This approach is also supported by LangChain, and we used the provided function (currently, implementation via LangSmith is recommended). The following two criteria are defined, each scored on a 10-point scale:

- Appropriateness: has the LLM output been appropriately corrected in accordance with the guidelines?
- Formatting consistency: are there no unnecessary line breaks or spaces, and is the format consistent?

The code and prompts look like this:

```python
input_prompt = """
<input_sql_ddl>CREATE TABLE ...</input_sql_ddl>
<table_check_rule>Ambiguous object names are...</table_check_rule>
Instructions: Based on table_check_rule, correct input_sql_ddl to the appropriate DDL.
"""

output_ddl = "CREATE TABLE ..."  # in reality, the DDL generated by the LLM

appropriateness_criteria = {
    "appropriateness": """
    Score 1: ...
    ...
    Score 7: Responses generally following the input instructions have been generated, with no more than two inappropriate corrections.
    Score 10: Responses completely following the input instructions have been generated.
    """
}

evaluator = langchain.evaluation.load_evaluator(
    "score_string",
    llm=model,
    criteria=appropriateness_criteria,
)

result = evaluator.evaluate_strings(
    prediction=output_ddl,
    input=input_prompt,
)
print(result)
```

This implementation produces output like the following (some parts omitted):

```text
This answer completely follows the given instructions. The following are the reasons for the evaluation.
1. Extraction and evaluation of column names: the answerer extracted all column names and appropriately judged whether each column name allows inferring the contents of the data.
2. Identification of ambiguous column names: all column names in the provided DDL clearly indicate their purposes and the type of stored data. For example, ...
...
This answer fully understands and properly executes the given instructions. Rating: [[10]]
```

This mechanism corresponds to the red boxes in the architecture diagram below. Once the LLM review results are stored in S3, the Lambda for LLM-as-a-Judge is launched asynchronously via SQS. This Lambda performs the evaluation, stores the results as logs in S3, and sends the score as a custom metric to CloudWatch. A CloudWatch alarm then notifies Slack if the threshold is not met. The score is not 100% reliable, but since this is an in-house system, it is an environment in which user feedback is easy to obtain. So we have established a setup for continuously monitoring performance with quantitative scores while collecting user feedback regularly.
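For reference, publishing the judge's score as a custom metric is a single API call. A minimal sketch; the namespace, metric, and dimension names are hypothetical:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical namespace/metric names; the score comes from the LLM-as-a-Judge Lambda.
cloudwatch.put_metric_data(
    Namespace="DDLReview/LLMJudge",
    MetricData=[{
        "MetricName": "AppropriatenessScore",
        "Dimensions": [{"Name": "Product", "Value": "sample-product"}],
        "Value": 10.0,  # e.g., the "Rating: [[10]]" parsed from the judge's output
        "Unit": "None",
    }],
)
```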
## Lessons Learned and Future Prospects

Finally, we summarize what we learned from developing a generative AI application, and our future direction.

### Evaluation is very important, but difficult

By evaluating prompt results from a consistent perspective, we were able to repeat tuning and evaluation quickly while eliminating subjectivity. This experience strongly highlighted the importance of evaluation design. However, the three evaluations in GenAIOps (during model selection, development, and operation) need to be designed per use case, and we found "judging the validity of our own evaluation design" difficult. Furthermore, a lack of evaluation perspectives poses the risk of delivering applications with compliance issues. We believe that more systematic, managed evaluation methods and mechanisms will make GenAIOps easier to realize in the future.

### Our imagination for generative AI use cases has broadened

By researching and implementing a generative AI application ourselves, we gained a clearer understanding and broadened the range of use cases we can imagine. For example, we can now envision a system that combines agents with mechanisms for collecting information on lock contention, enabling more managed and faster incident investigations.

### Using generative AI as a replacement for programmable tasks

There are two main ways to use generative AI in application development:

1. Having the generative AI perform the task itself.
2. Improving the productivity of program development with generative AI.

This time, we used generative AI to handle not only tasks that genuinely require inference, but also tasks that would normally be processed by a program. Initially, we hoped that with inventive prompts we might quickly obtain high-precision results, but we soon became acutely aware that prompt tuning takes a lot of time. On the other hand, by running multiple Claude models on the same task, we found that the more accurate the model, the clearer the improvement in results; more accurate models also behave less unpredictably and require less prompt-tuning time. Based on this experience, if model accuracy keeps improving, there may be more and more cases where, depending on the requirements, the approach of having a generative AI perform a task itself, rather than writing a program to do it, is chosen.

### Future prospects

In the future, we plan to focus on the following:

- Expanding the supported guidelines
- Expanding the products the system is introduced to
- Creating a hybrid configuration that combines programmatic syntax analysis
- Improving the clarity of review-result feedback to users
- Expanding the current simplified LLMOps so that it not only monitors but also enables prompt and model improvements using logs
  - Reference: post by @Hiro_gamo

## Conclusion

In this article, we introduced an automated review function for database table design, implemented with Amazon Bedrock on a serverless architecture, and explained how to evaluate generative AI applications. At the recent AWS AI Day session "Best Practices for Implementing Generative AI Functions in Content Review," key considerations and strategies for the practical introduction of LLMs (large language models) were presented. Below, we outline how our efforts align with the items covered in that session.
Separation from other methods
- The DBRE team is taking on the challenge of automation using an LLM.
- In the future, we aim to combine the LLM with a rule-based approach.

Accuracy
- Designing evaluation criteria based on the use case "table design review"
- Developing a dedicated application to rapidly repeat prompt tuning and evaluation
- Executing prompt tuning according to the best practices for the selected model (Claude)
- Limiting the number of items the AI checks per prompt execution to one, combined with prompt chains, to improve accuracy

Cost
- Approximately $0.1 per DDL, depending on the number of characters in the DDL
- Model selection focusing on accuracy over cost (Claude 3.5 Sonnet) because of the low frequency of reviews
- We judged that while a prompt chain increases costs accordingly, it provides the benefit of improved accuracy

Availability / Throughput
- Implementing request distribution and retry processing between regions with awareness of quotas
- We plan to transition to more managed cross-region inference

Response speed
- Since there is no requirement to be "as fast as possible," model selection prioritized accuracy over speed
- Reviewing each DDL in parallel improved speed
- Responses were returned within 2-5 minutes for dozens of DDLs

LLMOps
- Continuous accuracy monitoring using LLM-as-a-Judge

Security
- For integration with GitHub, a GITHUB_TOKEN valid only during execution of the GHA workflow was adopted
- Since these are in-house applications and inputs are limited to DDL, evaluation of compliance of LLM responses has not been conducted yet

This feature is currently being introduced into multiple products, and improvements will continue based on user feedback. Generative AI services, including development offerings such as Amazon Bedrock Prompt Flow, continue to evolve, and we believe they will become even more convenient in the future. We will continue to actively venture into the field of generative AI.

The KINTO Technologies DBRE team is actively looking for new members to join us! Casual interviews are also welcome, so if you're even slightly interested, feel free to contact us via DM on X (formerly Twitter). If you'd like, feel free to follow our company's recruitment X account as well!
This article is part of day 2 of KINTO Technologies Advent Calendar 2024.

Introduction

Hello! This is high-g (@high_g_engineer) from the New Car Subscription Development Group at the Osaka Tech Lab. Recently, solving TypeScript type puzzles known as type-challenges has become part of my daily morning routine before work. In this article, I will introduce some slightly quirky tips for infer in the TypeScript type system. Before explaining infer, I will first explain Conditional Types, which are essential for using infer.

What are Conditional Types?

Conditional Types are a feature of the type system that allows conditional branching at the type level. Here is an example:

type ConditionalTest<A> = A extends 'a' ? true : false

The right-hand side has the following meaning:
- If type A is assignable to the literal type 'a', the result is the true type.
- Otherwise, the result is the false type.

It behaves like the conditional (ternary) operator in general programming languages. By the way, the extends keyword here has a different meaning than inheritance in general object-oriented programming: in this position, extends checks the assignability of a type.

What is infer?

infer, introduced in TypeScript 2.8, is only available within Conditional Types. Here is an example of infer:

type InferTest<T> = T extends (infer U)[] ? U : never

On the right-hand side we use Conditional Types, and to the right of the extends keyword we specify, with infer, the type we aim to obtain. For the type above, if T is "an array of some element type," the element type is returned. (Here, never is the type returned when the condition is not met.) (infer U)[] represents an array of any element type, so array types such as string[], number[], and boolean[] all match. So if T were number[], the type would resolve as follows:

type Result = InferTest<number[]> // number

This is just one example; there are many other ways to use infer.

Function manipulation using infer

Getting the return type:

const foo = (): string => 'Hello, TS!!'
type MyReturnType<T> = T extends (...args: any[]) => infer R ? R : never
type FunctionReturn = MyReturnType<typeof foo> // string

Write the Conditional Type as before, and specify the desired type to the right of the extends keyword. This time, we want to extract the return type, so we place a function type to the right of extends and infer its return value. This is the same behavior as TypeScript's built-in utility type ReturnType<T>.

Getting the argument types:

const foo = (arg1: string, arg2: number): void => {}
type MyParameters<T> = T extends (...args: infer Arg) => any ? Arg : never
type FunctionParamsType = MyParameters<typeof foo> // [arg1: string, arg2: number]

Since the arguments form a tuple type, a rest parameter can handle any number of arguments. You can extract types by combining Conditional Types, the type you want to get, and infer. This is the same behavior as TypeScript's built-in utility type Parameters<T>.
Manipulating array types (tuple types) using infer

Getting the last element

Retrieving the type of the first element of a tuple type is straightforward:

type Tuple = [number, '1', 100]
type GetType = Tuple[0] // number

However, to get the type of the last element of a tuple type, you cannot write Tuple[length - 1] in TypeScript. The best solution is infer. The type definition is as follows:

type ArrayLast<T> = T extends [...infer _, infer Last] ? Last : never

[...infer _, infer Last] extracts the last element's type as Last if T is an array or tuple type.

type Test1 = ArrayLast<[1, 2, 3]> // 3
type Test2 = ArrayLast<[string, number, boolean]> // boolean
type Test3 = ArrayLast<[]> // never

Manipulating literal types using infer

Getting the type of the first character of a literal type:

type LiteralFirst<T extends string> = T extends `${infer First}${string}` ? First : never

${infer First}${string} extracts the first character of the string as First and treats the rest as string.

Obtaining a literal type with the first letter capitalized:

type FirstToUpper<T extends string> = T extends `${infer First}${infer Rest}` ? `${Uppercase<First>}${Rest}` : never

As before, the string is split apart, and the utility type Uppercase<First> converts the first letter to uppercase before it is combined with the rest of the string. The never type is returned if the string is empty.

Trimming spaces, newline characters, and similar elements from the beginning and end of a literal type:

type Space = ' ' | '\n' | '\t'
type Trim<S extends string> = S extends `${Space}${infer T}` | `${infer T}${Space}` ? Trim<T> : S

By defining a type called Space that covers the space, newline, and tab characters, you can use Conditional Types to apply the Trim type recursively, removing whitespace from the beginning and end of the string.

Conclusion

As these examples show, infer lets you extract specific parts from a given type, significantly enhancing flexibility in defining and working with types. It's a bit unconventional, so it may take some time to get used to, but it's an incredibly useful feature. Using types enables robust development; however, representing types poorly can lead to the following risks:
- Poor code readability due to unnecessary type definitions
- Increased maintenance costs due to unnecessarily complex type definitions
- Reduced type safety

By effectively leveraging TypeScript type system features such as infer, you can achieve concise and clear type expressions while consistently improving development productivity and quality.
Introduction

Hello! JSConf JP 2024 recently took place, and we were proud to support it as a premium sponsor! In this post, we'd like to share session reports from our team members who attended the event firsthand!

ITOYU

You Don't Know Figma Yet - Hacking Figma with JS
https://jsconf.jp/2024/talk/hiroki-tani-corey-lee/

Since Figma runs in a browser, I learned that you can use JavaScript in DevTools to manipulate it. You can use Figma's official API via the figma global object to retrieve CSS information from elements or create new layers. Leveraging Figma's browser-based nature unlocks its full potential, and I was truly amazed by the limitless possibilities it offers. Please refer to the official Figma documentation below for information on what can be done through global objects.
https://www.figma.com/plugin-docs/api/global-objects/

All happy projects are alike; each unhappy project is unhappy in its own way
https://jsconf.jp/2024/talk/mizchi/

Mizchi-san shared insights on common issue patterns based on their experience in performance tuning consulting. Their remark that adopting an easy anti-pattern can lead to an explosion later reminded me of my own past experiences: using a hack from Stack Overflow to fix an immediate issue, even though it's marked as deprecated in the documentation, might lead to even bigger problems later. This session reinforced the importance of seeking fundamental solutions instead of relying on makeshift fixes.

nam

Solving a Coding Test with Generative AI (by HireRoo)
https://jsconf.jp/2024/talk/hireroo/

Hosted by HireRoo, a provider of coding test services, this workshop featured a real coding exam. We tried to solve it with generative AI, which is something we don't usually have the opportunity to do. (I thought it was a regular session, but when I went to watch, it turned out a PC was required... so I ended up solving it on a PC borrowed from HireRoo. Sorry!) I tried solving it using ChatGPT, but simply pasting the problem text as-is didn't produce the expected code. It turns out some ingenuity is required after all; things don't always go as smoothly as you'd hope. Incidentally, it seems that quite a bit of insight can be gained from how candidates use generative AI during actual coding tests.

LT (Lightning Talk): The Ecosystem behind JavaScript (Comedy)
https://jsconf.jp/2024/talk/ecma-boys/

In this session, the speaker shared an amusing story in which their mother had forgotten the names of certain package managers and bundlers, and based on the characteristics she described, they speculated on what she might have been referring to. The talk featured several hilarious "power words," such as "the package manager my Okan (mom) is curious about," which made it incredibly engaging. Personally, one part that stuck with me was the declaration, "If the configuration file is easy to understand, it's definitely not webpack," during the discussion about bundlers.

Kiyuno

LT (Lightning Talk): JavaScript Module Resolution Interoperability
https://jsconf.jp/2024/talk/berlysia/

This session was about resolving JavaScript modules. Recently, I encountered CJS/ESM issues while implementing unit tests for components that included Swiper, and it made me realize my lack of knowledge in module resolution. It gave me a sense of urgency to improve in this area.
Additionally, the migration from Jest to Vitest had been a minor topic within the team, so I felt it was necessary to fully understand this subject. That said, the topic was very challenging for me, and I could barely keep up. I hope to become someone who can nod along and fully grasp these discussions someday soon. 🥲

LT (Lightning Talk): The Experience of In-House Production of a Car Subscription Service with Next.js and One Year Later
https://jsconf.jp/2024/talk/kinto-technologies/

This was our company's session. Since the in-house production project was completed long before I joined the company, I attended the session to learn about its history. The technology stack introduced during the session was almost the same as the one I am currently working on, so the challenges and future prospects felt highly relevant to me.

Ren.M

LT (Lightning Talk): romajip: A Story about Creating an English Address Conversion Library Using Japanese Address CSV Data
https://jsconf.jp/2024/talk/kang-sangun/

In this session, the speaker talked about creating a library called romajip. Personally, I was surprised to learn that not only the post office but also the Digital Agency provides a Japanese address master. I also felt that handling issues like identical place names (e.g., Nihonbashi in both Tokyo and Osaka) seemed quite challenging. If I ever have the opportunity to create a library myself, I'd like to document the challenges I encounter and the areas I pay particular attention to!

Introduction of Prerender Technology at LINE Yahoo Japan and Its Effects
https://jsconf.jp/2024/talk/masanari-hamada-tomoki-kiraku/

In this session, the speaker discussed the verification process for introducing prerender technology at LINE Yahoo Japan. Prerender is a technology that pre-loads the destination page when you hover over a link! The result is significantly faster page loading and a greatly improved user experience. After conducting various tests, LINE Yahoo Japan ultimately decided not to adopt the technology due to issues such as link congestion. Still, depending on how it's used, prerendering seems to have the potential to greatly improve loading speeds. I'm interested in studying it further myself!

Novelty

We visited the sponsor companies' booths and received various novelties! Personally, I was delighted with Mercari's "SOLD OUT" keychain!
Official T-shirts and tote bags
Sponsor novelties

We Are Hiring!

KINTO Technologies is looking for new teammates to join us! We're happy to start with a casual chat, so feel free to reach out. If you're even a little curious, please apply using the link below!
https://hrmos.co/pages/kinto-technologies/jobs/1955878275904303141

Conclusion

We hope you enjoyed this report. We look forward to participating in next year's JSConf JP in person! Thank you for reading all the way through!
Hello

Hi, I'm Tanachu, and I joined in October 2024! In this article, I asked everyone who joined in October and November 2024 about their impressions right after joining, and compiled the results. I hope this will be useful content for anyone interested in KINTO Technologies (KTC), and a good retrospective for the members who took part!

H.I

Self-introduction
Nice to meet you. I'm H.I, newly assigned to the Mobile App Development Group. I have been involved in Android development so far, and I'm looking forward to growing together with everyone and contributing in this new environment. Beyond development, I'm particularly interested in data analysis and service growth, and I want to make the most of my abilities as a member of the team. I still have a lot to learn, but I'll do my best!

What is your team structure like?
My team has adopted Scrum and works through an agile development process. We plan, develop, and hold retrospectives around sprints, continuously improving in short cycles.

What was your first impression of KTC? Any gaps?
Since it's a group company of a large corporation, I expected a somewhat rigid atmosphere, but after joining I was surprised at how flexible it is. There are also lots of study sessions, and I'm looking forward to what's ahead.

What's the atmosphere like on the ground?
I feel the members all have strong technical skills and high aspirations. It's an atmosphere where you can casually ask questions and make suggestions.

How did you feel about writing for the blog?
I'd like to write as much as I can. I want to share what I've learned and experienced.

Horichan's questions for H.I
Do you have a favorite event at the company (official or unofficial)?
There are many development study sessions and opportunities to attend external events. People share things I didn't know about or was interested in but never got to try, so I think these will become seeds for my own growth.
Tell us something great about your hometown!
It's an island near Seoul, South Korea. It's rich in nature and famous for producing eel and ginseng. There are also many historic places such as temples, so quite a few tourists visit.

Tanachu

Self-introduction
I'm Tanachu, and I joined in October. I belong to the Cyber Security Defense Team in the Security & Privacy Group. At my previous job at a security vendor, I handled incident response and log analysis related to cybersecurity. In my current team, I mainly work on security log monitoring using SIEM and on building the monitoring framework.

What is your team structure like?
The group has nine members in total, from four countries including Japan. It consists of four teams handling incident response for security and privacy, assessments against security guidelines, building the company's cybersecurity framework, vulnerability assessments, SOC, and so on. My own team has four members, mainly split between vulnerability assessment and SOC.

What was your first impression of KTC? Any gaps?
I had heard before joining that members come from various countries, but I was a little surprised to be the only Japanese person on my team. The company promotes the use of generative AI more strongly than I imagined, and being able to easily use generative AI at work was a positive gap. Beyond chatting with generative AI, I'm exploring daily how to apply it to my own work.

What's the atmosphere like on the ground?
We hold a lunch gathering once a week for whoever can attend, and I get the impression that communication between members is valued.

How did you feel about writing for the blog?
I used to read this blog before joining, so it feels a little strange that my own words will be published here.

H.I's question for Tanachu
Do you have your own way of relaxing?
Watching TV Tokyo's news program "World Business Satellite" at the end of the day!

lksx

Self-introduction
I'm lksx, and I joined in November. I'm a backend engineer in the DX Development Group; I was also a backend engineer at my previous job. I was nervous at first, but I'm learning a lot and my days are very fulfilling. I look forward to working with everyone.

What is your team structure like?
My development team currently has eight members: seven work at the Muromachi office and one joins remotely from the Osaka office. Since members are spread across locations, there is a lot of online communication, and efficient collaboration is essential.

What was your first impression of KTC? Any gaps?
KTC gave me a strong startup impression, and it was refreshing to see multiple products progressing in parallel. It was also striking that several products are still at the PoC stage, which made me feel this is a company that keeps taking on new challenges. So far I haven't felt any big gap, but it's a fast-paced environment, so I want to get used to that pace.

What's the atmosphere like on the ground?
Everyone is easy to talk to and the mood is relaxed. It's an environment where I can casually ask for help when I'm stuck, so even as a brand-new joiner I can work with peace of mind.

How did you feel about writing for the blog?
I haven't really written blogs before, and this is my first time, so I'm excited.

Tanachu's question for lksx
What does an average workday look like (e.g., right after starting you do X; in the afternoon you mostly do Y)?
I've only just joined, so there is a lot to learn and study. I spend about 30 minutes studying work-related skills from 9:30 each morning. I tend to focus on mentally demanding tasks in the morning and physically demanding tasks in the afternoon.

Kirby (Ka-bi-)

Self-introduction
I'm Kirby, and I joined in October 2024. After six years of experience as a director and PdM at an IT company in Tokyo, I returned to my hometown of Osaka hoping to make use of my skills there. After working in planning and running real-world events, I joined KTC. I currently belong to the KINTO FACTORY development team in the New Service Development Group and work at Osaka Tech Lab.

What is your team structure like?
Within KTC, the KINTO FACTORY development team consists of one manager, two PdMs, four frontend engineers, five backend engineers, and one QA, developing the site for a service that lets car owners customize and upgrade their vehicles. At Osaka Tech Lab, two of us, a backend engineer and myself, drive development.

What was your first impression of KTC? Any gaps?
Since our sites in Tokyo, Nagoya, and Osaka are far apart, I was a little worried about communication. However, the TOYOTA group culture of valuing face-to-face conversation ("menchaku") is deeply rooted at KINTO and KTC, and there are frequent opportunities to meet offline. So the gap was a good one. (I'm actually writing this blog in Nagoya.) Osaka Tech Lab members also communicate actively across projects, the horizontal connections are strong, and the many engineers eager to take on technical challenges inspire one another.

What's the atmosphere like on the ground?
I'm happy to work with members who genuinely want to be involved in product and service development and make the service better.

How did you feel about writing for the blog?
I'm glad to take part in the tech blog I was reading thoroughly before joining!

lksx's question for Kirby
Is there something you want to try at KTC?
Beyond contributing to KINTO FACTORY's service growth, I want to take on challenges like handling an entire project within Osaka Tech Lab alone!

Jun

Self-introduction
I'm Jun, and I joined in November 2024. After nearly seven years of experience as a web engineer at a business company, I decided to move to KTC to hone my skills in a more challenging workplace.

What is your team structure like?
I belong to the backend team of the New Car Subscription Development Group, which has 13 members. Multiple projects always run in parallel within the team; when a project kicks off, members are selected, and each project is run by two to four people.

What was your first impression of KTC? Any gaps?
My impression was that there are many engineers eager to learn. Besides study sessions within the team, company-wide study sessions are held frequently, and many engineers participate enthusiastically. I had assumed that studying engineering was something you did steadily on your own, so this was a pleasant surprise.

What's the atmosphere like on the ground?
Everyone joined mid-career and brings a wide range of knowledge, and since there is no rigid "this is how it must be done," discussions on technology selection and operations are very lively. I get good stimulation every day.

How did you feel about writing for the blog?
I used to write a tech blog, but it had been dormant for a long time, so this made me want to start again.

Kirby's question for Jun
Any recommended routine for switching gears?
I love coffee, so when my thinking gets stuck I brew a cup to refresh.

H.I.

Self-introduction
I'm H.I., and I joined in November. I've experienced marketing, development, and product management at several IT ventures. I'm now a PdM for KINTO Unlimited, working at the Muromachi office in Tokyo.

What is your team structure like?
On KINTO Unlimited, there are three PdMs including me and about 20 engineers. Among the three PdMs, I handle data analysis and marketing, one handles development, and the other handles UI/UX. We've divided the PdM role well, and I think we strike a very good balance where each of us can play to our strengths.

What was your first impression of KTC? Any gaps?
Compared with the companies I've worked at, the age range is a bit higher, so I was a little on guard, but everyone is so nice that I can work with peace of mind (really). On the other hand, the organization is much bigger than I expected before joining, so the number of stakeholders and not knowing who does what was a gap that I, coming from ventures, had to adapt to.

What's the atmosphere like on the ground?
Compared to a typical product company, we probably have abundant resources, so if you have an idea you can plan all sorts of initiatives. I've also become friendly with people from other teams; recently near my desk we all ate cake together at Christmas, and a little coffee shop popped up with people bringing their own beans and grinders, haha.

How did you feel about writing for the blog?
Someday I'd like to write something that isn't a self-introduction!

Jun's question for H.I.
Which skills from your previous jobs have been most useful at KTC?
The Python and statistics knowledge I studied as a hobby has been very useful. Also, having experienced both marketing and development, I feel I've been able to cover a wider range as a PdM.

Fúyuán

Self-introduction
I'm Fúyuán from the Planning & Administration Group of the Development Support Division, and I joined in November 2024. I work at Osaka Tech Lab. I handle KTC's accounting-related back-office work, such as budgeting, invoicing, and payments for contracted development.

What is your team structure like?
The Planning & Administration Group has five members, dividing responsibilities such as budgeting, contracts, invoicing and payments, and IP management for KTC's contracted development work. Members work in Tokyo, Nagoya, and Osaka, so we are physically far apart, but we are always connected on Slack, so it feels like working right next to each other.

What was your first impression of KTC? Any gaps?
I moved from a thoroughly traditional Japanese corporation, so I was surprised at how quickly decisions turn into action and how flat and easy communication is. I got the same impression from my interviewers during hiring and found it appealing, so there was no gap.

What's the atmosphere like on the ground?
There's a great rhythm between quiet, focused work time and lively chit-chat; the contrast is striking. Everyone's background varies in nationality, age, and tenure, so I get good stimulation every day.

How did you feel about writing for the blog?
I'm usually a lurker and don't do social media or blogs even privately, so I'm nervous...

H.I.'s question for Fúyuán
What's your morning routine on work-from-home days?
Moving my body before starting work clears my head, so I make a point of doing radio calisthenics. My favorite is exercising on the balcony in the sun on a nice day (please end soon, pollen season...!).

R.O

Self-introduction
I belong to the Project Promotion Group of the New Service Development Division, working as a project manager on KINTO ONE development projects. At my previous job at an offshore development company, I worked as a project manager with members in Vietnam on development projects across construction, entertainment, finance, and more.

What is your team structure like?
Including partner companies, the team is about ten people. Everyone is a project manager, each participating in a different KINTO ONE development project.

What was your first impression of KTC? Any gaps?
Having heard that the age range is higher than at my previous job and that everyone is a mid-career hire, I was needlessly nervous about the atmosphere and how to gauge the distance with colleagues, but there are many sharp people and it's an environment where consulting and asking questions is easy. Internal study sessions and participation in external ones were more active than I imagined. Quite a few members travel as needed to collaborate offline with members at other sites. I had imagined communication would be mainly online, so it's nice that it's easy to meet project stakeholders in distant locations in person when needed.

What's the atmosphere like on the ground?
The Project Promotion Group I belong to has a calm rather than boisterous atmosphere, with work consultations and chit-chat happening gently. Day to day there is a lot of message-based communication on Slack and similar tools, but there are also plenty of occasions to walk over to the developers' seats to confirm specifications directly, so I feel it's easy to talk and consult across departments. Some engineer-heavy teams work while chatting animatedly about specs and implementation, so the mood really differs by team.

How did you feel about writing for the blog?
I like putting my feelings and thoughts into writing, so I'm looking forward to it. I'm also glad to get the chance to contribute to the blog I used for research before joining.

Fúyuán's question for R.O
How do you catch up on new technologies, products, and services?
I made an X (formerly Twitter) account that follows only accounts posting such information, so only the information I want flows through my timeline. I also gather information by connecting (over drinks) with friends and acquaintances in other industries, and when I want to feel the energy around a service or new technology, I attend offline events such as conferences.

Horichan

Self-introduction
I'm Horichan, and I joined in October 2024! I'm from Hiroshima, a Carp fan, and I like imo shochu. I was originally an Android engineer and switched to PdM at my previous job two years ago. I'm now involved as a director in developing systems that solve challenges for Toyota dealerships, fighting the good fight every day.

What is your team structure like?
The DX Solution Group of the Mobility Product Development Division has seven people: one producer, three directors, and three designers, each driving their own projects.

What was your first impression of KTC? Any gaps?
My first impression is that it's a company with the best of both a large company and a venture! The environment, such as the housing allowance and generous paid leave, feels big-company, but the frequent employee-led study sessions (and Beer Bash) and the energy of the company-wide "Cho-Honbu-kai" meetings feel venture-like, and I love that. If I had to name a gap, it would be the tools... At my previous job I was in charge of internal systems and made all documents in FigJam or Notion, but now PowerPoint is the main tool. I'm getting fairly used to it, though! At this pace I'm aiming to become a PowerPoint master 💪

What's the atmosphere like on the ground?
Everyone in the group is kind, and I'm able to learn at my own pace! Lunches are fun too 🫶 In the project team there's a real one-team feeling, with professionals in each role all aiming together to solve the dealerships' challenges.

How did you feel about writing for the blog?
I wrote a few posts at my previous job, so I'd been curious! I'm happy to take part ☺️

R.O's question for Horichan
Is there any company culture or event at KTC you found interesting?
There are so many chances to hear from the leadership! I was already surprised that new-hire training includes round-table talks with President Kodera and Vice President Kageyama, and after that there are plenty more: the monthly all-hands meetings, the Cho-Honbu-kai, 1-on-1s (for those who want them), U35 events (for those who want them), and so on 😳 During onboarding, Kodera-san said we were welcome to invite him out for drinks, so a few of us actually did, and we had a great time talking about all sorts of things 🍶 (R.O-san and Tanachu-san came along too! 🙌)
Introduction

Hello! I'm Alex, a generative AI engineer on the Generative AI Utilization Project at KINTO Technologies. Demand for multi-agent systems based on LLMs (large language models) has been growing rapidly. In this post, I'll walk through a real example of building a supervisor-style multi-agent system in under 30 minutes using langgraph_supervisor, a recent addition to the LangGraph ecosystem. In this system, a central supervisor efficiently coordinates and orchestrates multiple specialized agents, and it can be applied to a wide range of use cases, such as automating business tasks and generating customizable proposal scenarios.

What is LangGraph?

LangGraph is a Python library that makes it easy to build AI agents and RAG (Retrieval-Augmented Generation) systems. Combined with LangChain, it lets you design and implement complex workflows and tasks efficiently. It provides state persistence, tool calling, and a central persistence layer that makes human-in-the-loop intervention and after-the-fact verification easy, and it served as the foundation for the supervisor-style system described here.

What is langgraph-supervisor?

langgraph-supervisor is a recently announced Python library for building hierarchical multi-agent systems on top of LangGraph. A central supervisor agent oversees the specialized agents and manages task assignment and communication between them, making it possible to build flexible systems that handle complex tasks efficiently.

Main features:
- Creating a supervisor agent: oversees multiple specialized agents and orchestrates the whole workflow.
- Tool-based handoff mechanism between agents: provides a mechanism for smooth communication between agents.
- Flexible message-history management: message history can be managed flexibly, making conversation control easy.

What is a supervisor-style multi-agent system?

Source: https://langchain-ai.github.io/langgraph/tutorials/multi_agent/agent_supervisor/

A supervisor-style multi-agent system is a multi-agent structure in which an agent called the supervisor controls the whole: it works with tool-calling LLM agents and decides which agent to call when, as well as what arguments to pass to each of them.

Building a multi-agent system with langgraph-supervisor

Below are the steps for actually building a multi-agent system with langgraph-supervisor. This time, we'll build a system that takes only anonymized basic customer information as input and outputs both a recommended car for that customer and a recommended engine type. The car and engine information is fetched by the agents from a locally stored CSV file, 車両情報.csv (vehicle information).

Environment setup

Preparing an Azure OpenAI API key

We use GPT-4o via Azure OpenAI here; depending on your situation, the OpenAI API, the Anthropic API, and others can be used instead. Set the Azure OpenAI API key and endpoint as environment variables:

os.environ["AZURE_OPENAI_API_KEY"] = "YOUR API KEY"
os.environ["AZURE_OPENAI_ENDPOINT"] = "YOUR ENDPOINT"
os.environ["AZURE_OPENAI_API_VERSION"] = "YOUR API VERSION"
os.environ["AZURE_OPENAI_DEPLOYMENT"] = "YOUR DEPLOYMENT NAME"

Installing LangGraph and LangChain

pip install langgraph
pip install langchain
pip install langchain_openai

Installing langgraph-supervisor

pip install langgraph-supervisor

Setting up the LLM and the tool functions used by each agent

Define the model with Azure OpenAI GPT-4o, along with the tool functions for the agents:

from langchain_openai import AzureChatOpenAI
import pandas as pd

# Initialize the LLM
llm = AzureChatOpenAI(
    azure_deployment=os.getenv("AZURE_OPENAI_DEPLOYMENT"),
    api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
)

car_information_path = "/xxx/xxx/車両情報.csv"  # Path to the CSV file

# Example tool function: read vehicle information from the CSV and generate candidates
def get_car_features():
    """The python code to get car information."""
    df = pd.read_csv(car_information_path)
    car_features = df[["車種名", "ボディタイプ", "説明"]].drop_duplicates()
    return car_features.to_dict(orient="records")

# Example tool function: extract the engine types for the selected car model
def get_engine_type(selected_car):
    """The python code to get engine type and engine information."""
    df = pd.read_csv(car_information_path)
    engine_types = list(df[df["車種名"] == selected_car]["エンジンタイプ"].unique())
    return engine_types, "エンジンに関する補足情報"

Defining each agent

Using LangGraph's create_react_agent, define each specialized agent (car recommendation and engine type selection):

from langgraph.prebuilt import create_react_agent

# Car recommendation agent
car_agent = create_react_agent(
    model=llm,
    tools=[get_car_features],
    name="car_agent",
    prompt="""
    # 指示書
    提案パターンに基づいて、推薦する車種を選び、200文字程度の根拠を説明してください。
    """
)

# Engine type selection agent
engine_agent = create_react_agent(
    model=llm,
    tools=[get_engine_type],
    name="engine_agent",
    prompt="""
    # 指示書
    推薦車種に最適なエンジンタイプを選び、200文字程度の根拠を説明してください。
    """
)

Defining the supervisor

Create a supervisor that oversees the agents and generates the final proposal based on the customer information. One thing we did here was to modularize the prompt given to the supervisor for greater reusability: if you change the contents of the role and task variables and the linked agents, the same scaffolding can be reused for other tasks.
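As a side note on that reusability claim, here is a minimal hypothetical sketch (not part of the system built in this article) of swapping in a completely different role, task, and agent while keeping the same scaffolding. It assumes the llm defined above; search_hotels and the travel-planning wording are invented placeholders:

from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

def search_hotels(city: str):
    """Hypothetical tool: return hotel candidates for a city."""
    return [{"name": "Sample Inn", "city": city, "price_per_night": 120}]

# A specialized agent for a different domain, built the same way as the agents above
hotel_agent = create_react_agent(
    model=llm,
    tools=[search_hotels],
    name="hotel_agent",
    prompt="Pick the best hotel for the customer and explain why in about 200 characters.",
)

# Only the module contents change; the prompt structure stays the same
role = "travel planner"
task = "Propose the best hotel based on the customer information."
system_prompt = f"## Role\nYou are an excellent {role}.\n## Task\n{task}"

travel_workflow = create_supervisor(
    [hotel_agent],
    model=llm,
    prompt=system_prompt,
)
travel_app = travel_workflow.compile()

The actual supervisor used in this article is defined next.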
from langgraph_supervisor import create_supervisor

# Build the contents of each prompt module
role = "車のセールスマン"
task = "顧客情報をもとに、最適な車とエンジンタイプを提案してください。"
guideline = """
- 出力は必ず上記JSON形式で行ってください。
- 各根拠は200文字程度の豊かな文章で記述してください。
"""
output_format = """
{
  "提案内容": {
    "提案する車種",
    "車種の根拠",
    "選択したエンジンタイプ",
    "エンジン選定理由"
  }
}
"""

# Assemble the prompt
system_prompt = f"""
## ロール
あなたは優れた{role}です。
## タスク
{task}
## ガイドライン
{guideline}
## 最終生成物のフォーマット
{output_format}
"""

# Create the supervisor
workflow = create_supervisor(
    # Linked to the agents created earlier
    [car_agent, engine_agent],
    model=llm,
    prompt=system_prompt
)

# Compile the graph
app = workflow.compile()

Example run

from langchain.schema import HumanMessage
import time

# Example customer information (add more detail as needed)
customer_info = "顧客情報: 年齢35歳、燃費重視、現在乗車中の車はコンパクトカー"

# Run the workflow
start_time = time.time()
result = app.invoke({
    "messages": [
        {"role": "user", "content": customer_info}
    ]
})
end_time = time.time()

# Show the final output
print("最終提案:")
print(result["messages"][-1].content)
print("実行時間: {:.3f}秒".format(end_time - start_time))

Response from the supervisor (the Japanese keys follow the output format defined in the prompt):

最終提案:
{
  "提案内容": {
    "提案する車種": "ハイブリッド技術を採用した最新コンパクトカー",
    "車種の根拠": "顧客は現在コンパクトカーに乗車中であり、燃費性能の良さを重視しています。ハイブリッド技術を採用した車は、燃料使用の効率が高く、経済的負担や環境への配慮が優れています。また、ハイブリッドシステムは短距離や都市部での走行にも最適です。そのため、最新のハイブリッドコンパクトカーが最適な選択肢となります。",
    "選択したエンジンタイプ": "ハイブリッドエンジン",
    "エンジン選定理由": "ハイブリッドエンジンは燃費性能が非常に優れており、顧客のニーズである燃料効率を最優先に考慮しています。さらに、コンパクトカーとの相性も良く、都市部での利用や日常の移動において高いパフォーマンスを発揮します。このエンジンの選択は、快適性、経済性、環境性能を全て満たします。"
  }
}
実行時間: 21.938秒

Impressions from actually building it

Rapid prototyping
With langgraph_supervisor, the complex inter-agent coordination and state management that plain LangGraph previously required can be implemented with simple code, and being able to build a prototype in under 30 minutes was very impressive.

Flexibility and extensibility
Each agent can be implemented with its own prompt and tool functions, so customizing to business needs is easy. And since state is managed centrally, future improvements and new features should be smooth to add.

We Are Hiring!

KINTO Technologies is looking for colleagues to promote the use of generative AI in our business. We're happy to start with a casual interview, so if you're even slightly interested, please contact us via the link below or by DM on X. We look forward to hearing from you!!

Thank you for reading to the end!
I'm Yuya Sakamaki from KINTO Technologies. I usually work on data analysis, proposing measures based on it, and developing features using machine learning. I previously worked on AI feature development for Prism Japan. We ran an in-house comparison of Cursor and GitHub Copilot, and here are the results.

Premise

- Within KINTO Technologies, GitHub Copilot is available to everyone as standard equipment.
- We verified whether using the Cursor editor could further improve productivity.

Cursor: an AI-powered code editor, created as a fork of Visual Studio Code.
GitHub Copilot: a tool developed by GitHub together with OpenAI, available as a plugin for various IDEs (Visual Studio Code, JetBrains, Eclipse, etc.).

Disclaimer

This article contains a good deal of my personal opinion. It is not meant to deny Cursor's usefulness.

Plans tested

- Copilot Enterprise ... $39 / month
- Cursor Business ... $40 / month

Conclusion

As of February 2025, we barely feel a functional difference between GitHub Copilot and Cursor. As of December 2024 we would have recommended Cursor, but Copilot's astonishing pace of evolution has removed the need to newly adopt the Cursor editor.

Evaluation period: December 2024 to February 2025

Comparison table (Copilot vs. Cursor)

We summarized the points that frequently came up while using Copilot and Cursor side by side. Details on each item follow below.

| Item | GitHub Copilot | Cursor |
|---|---|---|
| UI / usability | Inline suggestions and chat within VS Code; many operations require right-clicking | AI shortcuts are shown immediately; the inline menu is easy to understand |
| Quick Fix (completing imports, etc.) | VS Code's standard quick fix is powerful | Unstable and sometimes doesn't appear; "Fix in Composer" and the like take extra steps |
| Auto-applying similar fixes | You have to start typing something first | Predicts and suggests the next fix location with Tab (handy for fixing multiple spots) |
| Model selection | Changeable, but few options | Rich options for adding and selecting (multiple AI models can be used as needed) |
| Rule files | Configurable via .github/copilot-instructions.md | Per-project rules via a .cursorrules file |
| Loading documentation (@Docs, etc.) | Possible via plugins and the like | The @Docs feature can ingest framework and library documentation |
| Agent features | Similar features in Copilot Chat / Copilot CLI | Agent mode runs commands automatically and iterates by trial and error |
| Automatic test-code generation | Possible (though accuracy varies) | Also possible, but accuracy is mediocre; manual fixes are a must |
| Price (Business/Enterprise) | $39 | $40 |

Note: this evaluation reflects our subjective views as of February 2025.

Advantages of Cursor (compared to GitHub Copilot)

AI shortcuts are shown immediately
When you select code, AI shortcuts appear automatically, so there is no need to memorize special operations.
Can Copilot do the same? Inline chat itself is possible, but there is no COMPOSE mode that references other files, and you have to go through the right-click menu, which adds a step.

Similar fixes are suggested in bulk
When you want to apply a similar fix to several places, Cursor predicts "what to fix next and how" one location after another via Tab.
Can Copilot do the same? It won't predict until you start typing something, so bulk suggestions as smooth as Cursor's are difficult. (The situation changed even while writing this article: a similar feature has become possible in Copilot as well.)

Flexible model selection and addition
Cursor makes it easy to add and select arbitrary models yourself.
Can Copilot do the same? Copilot Chat can also switch models, but the current range is limited.

@Docs can ingest framework/library documentation
Feeding documentation in advance can improve answer accuracy.
Can Copilot do the same? Something close is possible with plugins, but it is not that rich as a standard feature.

Automatic trial-and-error execution in agent mode
In Cursor's agent mode, the AI runs commands automatically under certain permissions and iterates by trial and error.
Can Copilot do the same? Similar features are starting to appear in Copilot CLI and Copilot Chat, but fine-grained manual steps are still needed. Although still in preview, an agent mode has become available in Copilot as well.

Disadvantages of Cursor (compared to GitHub Copilot)

Import completion takes extra effort
VS Code + Copilot completes imports automatically, but Cursor won't import unless you hover the cursor and choose the quick fix. Copilot is more stable and capable here.

Test-code generation accuracy is mediocre
Generated tests often don't pass as-is and end up needing fixes. Copilot is similar, but Cursor didn't seem markedly better.

Parts that don't fit our operations yet
Depending on the app under development, Cursor's AI features can be more than necessary, and some members found them confusing.

The difference between Cursor's "Composer" and "Chat"

Cursor has two main modes, Composer and Chat. Their usage and context handling differ, so here is a brief summary of our in-house comparison:

| Feature | Composer | Chat |
|---|---|---|
| Context understanding | Automatically ties in the current file and suggests related files | Context starts empty; share it manually as needed |
| Main use | Mainly code generation and editing; suggestions are easy to apply inline right away | Mainly general questions and explanations; suited to longer exchanges |
| History management | Saved automatically in the control panel | Selectable chat history |
| UI | Available from the inline menu as well as the side menu | Side menu only |
| Code block execution | Not possible | Possible (run commands from Chat in the side menu) |

When to use Composer:
- Code fixes with simple context
- When you want to generate or edit code immediately

When to use Chat:
- Code fixes with larger context
- Analyzing error messages, general programming questions
- When you want to keep context over a longer period of work

Conclusion, again

As of February 2025, the functional difference between Copilot and Cursor is not that large. If anything, Copilot has evolved faster than imagined, and within a few months it filled in the areas where we thought "Cursor wins" back in December 2024.

That does not mean there is little merit in using Cursor. As an AI-powered IDE it still has unique strengths such as dedicated shortcuts and agent mode. However, since Copilot keeps catching up, the benefit of newly introducing Cursor may be fading.

So which one should you use?
If GitHub Copilot is already available as standard equipment in your company, Copilot alone will likely cover most situations for now. That said, if you want to actively adopt large-scale code modification or chat-driven development, Cursor's Composer / Chat features are worth trying; at the time of writing, the GitHub Copilot AI agent is available in VS Code Insiders. Prices are nearly identical, so choosing based on UI and feel is also a valid approach.

Developed apps
Prism Japan - iOS
Prism Japan - Android

References
Cursor - The AI Code Editor
Get code suggestions in your IDE with GitHub Copilot

That was the result of putting Cursor and GitHub Copilot through their paces. Product update speed keeps accelerating, so we intend to keep verifying regularly.
👋 Introduction

Hi there! I'm Moji, the product designer behind the Toyota Community by KINTO platform. Today, I'm excited to share the story behind its creation: a journey of collaboration, innovation, and designing with purpose. For me, design goes far beyond aesthetics; it's about solving problems and balancing user needs with business goals. That's why Toyota Community by KINTO has been such an inspiring project for me.

Toyota Community by KINTO logo and mobile interface mockups showing platform features

🚀 Background

What started as a simple idea during a hackathon has evolved into a product (a responsive web application) that we are now preparing to pitch as a scalable solution for better customer engagement, aimed at connecting Toyota customers and enthusiasts under one global platform. Over the past two years, my journey with this project hasn't just been about design and development; it's been an ongoing process of navigating corporate feedback, understanding the nuances of regional markets, and consistently refining the product to align with both business objectives and users' needs.

Photos from KINTO Global Innovation Days 2022, showcasing the winning hackathon team, team collaboration, and the Innovator of the Year awards

💡 The Challenge

Decentralized Structure and Value Chain Gaps

In the automotive industry, some brands operate with a decentralized structure. While this approach allows for flexibility and customization to meet local market needs, it can sometimes create challenges in maintaining global brand consistency and engaging customers effectively after purchase. This highlights an exciting opportunity for a centralized hub where brands can gain valuable customer insights and strengthen their image in a positive and impactful way.

Competitive Analysis

To better understand the online community landscape, I conducted a comprehensive analysis of existing online car communities and competitive platforms.

Key Insights

🚗 Toyota enthusiasts, especially those with older or restored models, head to external platforms like Toyota Nation (owned by VerticalScope) or Reddit to interact, share feedback, and discuss their vehicles.
🌐 Most platforms already offer community spaces (forums) where users can connect, exchange ideas, and learn from one another.

However, one significant gap stood out! Car lovers are deeply passionate about showcasing their vehicles, yet there wasn't a dedicated section that allowed owners to create a virtual garage where they could:
🛠️ Showcase modifications and customizations.
📸 Share photos of their cars.

Identifying the Opportunity

The following three strategic integrations ensured that our platform stood out from competitors and wasn't just a siloed social space, but a gateway where users could consume and discuss authentic content as well.

1. Connecting with Mr. Land Cruiser

As part of the broader Toyota network, we had the privilege of connecting with the ex-Chief Engineer of the Toyota Land Cruiser, Mr. Sadayoshi Koyari, also known as Mr. Land Cruiser. This connection allowed us to feature exclusive videos and a dedicated Ask Me Anything (AMA) channel, where users could directly interact with him, ask questions, and gain valuable insights.

Collaboration with Sadayoshi Koyari, famously known as Mr. Land Cruiser, featuring team discussions and filming sessions.
2. Sharing Reputable Automotive News from Toyota Times

Because reputable automotive news is also valuable, we negotiated with Toyota Times, a well-known source, and they approved sharing their official articles on our web application. This gave users access to both reliable brand content and a space to share personal stories and engage in meaningful discussions.

A screenshot of the Toyota Community by KINTO platform showing an article from Toyota Times about the Land Cruiser

3. Introducing the "Garage" Feature

For car lovers, we know how much joy comes from showing off their ride, sharing the customizations and modifications they've worked hard on, and connecting with others over that passion. So we introduced the "Garage" feature: a fun and personalized way for car owners to:
- Showcase photos of their cars.
- Document modifications, customizations, and significant milestones.
- Share their vehicle's unique story, all within a visually appealing interface.

A screenshot of the Toyota Community by KINTO platform showcasing a user's 1981 Land Cruiser FJ40 Soft Top in the virtual garage, with details on modifications and personal history.

🎨 Designing the Solution

Distributor Feedback

Before starting development of the MVP (Minimum Viable Product), I designed mockups and prototypes to gather early feedback from key stakeholders. After multiple iterations based on their input, we presented the finalized prototype to a few regional distributors across different markets to solicit feedback.

Early mockups of the Toyota Community, showcasing proposed features to solicit feedback on the design proposal.

Given the feedback received, we adjusted the platform's design to align with our new business goals and highlight our key differentiators. This allowed us to begin actual development of the MVP, with the goal of securing a proof of concept (PoC) before investing more time and resources.

🛠️ Phase 1 - Development

Crafting the MVP (Minimum Viable Product)

With limited resources, we designed and developed an MVP that addressed the gaps and differentiated itself from existing platforms. The product came to life in July 2024 around two core features:

💬 Community Section
A space for discussions around Toyota culture, where users create threads and react with emojis.
- Channels dedicated to Mr. Land Cruiser: exclusive videos featuring his garage, experiences, and interviews, plus a dedicated channel to ask him anything, allowing enthusiasts to connect directly with him.
- Toyota News Channel: a dedicated channel sharing reliable automotive news about Toyota.

🚘 Virtual Garage
A dedicated space where users can showcase their cars with images and personal stories, and share them with others.

Figma design screens and a timeline highlighting design stages and milestones
Three mobile screens showing Garage, Home, and Community sections of the Toyota Community by KINTO platform

Gathering User Feedback

A critical moment for the product came after our product owner attended a Land Cruiser event in Germany. This direct interaction with Toyota enthusiasts provided invaluable insights: attendees loved the idea of having a space to showcase their vehicles while connecting with Toyota icons like Mr. Koyari.

Photos from the Land Cruiser event in Germany, featuring modified vehicles and a group photo of Land Cruiser owners and enthusiasts.

Recognizing the significance of this event, we decided to create a strong presence that users could remember.
I took the lead in designing a logo and marketing materials, working closely with stakeholders to create a design that aligned seamlessly with both KINTO's and Toyota's brand identities.

A collection of promotional materials for the Toyota Community by KINTO, featuring flyers promoting the virtual garage and community benefits.

This combination of user feedback and last-minute branding helped refine both our product's direction and its visual identity, giving us a clearer path forward.

🔄 Phase 2 - Refinement

Adoption and Scaling

While we haven't yet had the resources to test the MVP with a larger audience, insights gained from distributor feedback and the Land Cruiser event have guided ongoing improvements to the platform. These refinements are paving the way for the next stages of development. The Toyota Community by KINTO platform is being positioned as a scalable solution, showcasing its potential to address challenges tied to a fragmented value chain while offering the flexibility to adapt to diverse regional markets.

What's Next

We remain focused on fine-tuning the platform for further review and exploring opportunities to highlight its value.

✨ Conclusion

A Scalable Future for Toyota's Engagement

Toyota Community by KINTO offers an opportunity to reconnect and unify global engagement within a centralized hub. With additional resources, this platform has the potential to become a valuable asset, enhancing customer engagement and fostering stronger brand loyalty. This project wasn't just about UX design or feature-building. It has shown me that the true power of user-centered design lies in its ability to adapt to both business goals and cultural needs. If you enjoyed this, feel free to explore my past articles, where I dive into topics like the importance of diversity and inclusivity as well as the process of redesigning the Tech Blog itself.
This article is the entry for Day 5 of the KINTO Technologies Advent Calendar 2024.

Hi, I'm Ryomm, also known as "The Phantom Bot Craftsman," and I develop the My Route iOS app at KINTO Technologies. Here's a quick tip from Manabyi. Although this article focuses on Slack CLI, note that it uses the Slack API internally, so the same issue can arise when interacting directly with the Slack API.

Background

While using Slack CLI, I encountered an issue where Block Kit messages failed to send under the following conditions:
- A form input contains a value in rich_text format.
- The value is stored in the DataStore as rich_text.
- Multiple rich_text entries are retrieved from the DataStore, concatenated, and formatted.
- A block is created and sent using postMessage.

The process failed with the error parameter_validation_failed.

![Error](/assets/blog/authors/ryomm/2024-12-04-2/02.png =600x)

Cause

The error was triggered by an invalid parameter during message transmission. Upon investigation, I found that block_id values were duplicated in the block I was trying to send, as shown below:

[
  {
    "type": "rich_text",
    "block_id": "xdrwH", // <- this
    "elements": [ /* ... */ ]
  },
  {
    "type": "rich_text",
    "block_id": "xdrwH", // <- this
    "elements": [ /* ... */ ]
  }
]

It appears that the message can't be sent to Slack because a block_id in the message conflicts with an existing one.

block_id

block_id is a unique identifier for a block. The official documentation explains it as follows:

A unique identifier for a block. If not specified, a block_id will be generated. You can use this block_id when you receive an interaction payload to identify the source of the action. Maximum length for this field is 255 characters. block_id should be unique for each message and each iteration of a message. If a message is updated, use a new block_id.
https://api.slack.com/reference/block-kit/blocks

If you create a block without specifying a block_id, one will be generated automatically. It is mainly used for interactive elements, such as identifying which block a button was pressed on during user interactions. A block_id must be unique within a single message and across a sequence of repeated messages (i.e., a series of bidirectional interactions). Additionally, when a message is updated, a new block_id must be used.

In this case, the block_id contained in the received rich_text was automatically generated. Since the rich_text input was received as separate messages during the initial process, block_id duplication was likely to occur. As a result, this issue arose from a conflict between auto-generated block_ids.

Here is a haiku for the occasion:

Thought it was safe,
Auto-generated failed me,
Block_id conflict

This haiku reflects my shock and disbelief: I had believed that an auto-generated block_id, like a UUID, wouldn't easily collide, yet it did. This is just my speculation, but it seems that Slack generates block_id based on the block's content. To test this, I entered the exact same text multiple times, and each time the same block_id was generated.

![Error](/assets/blog/authors/ryomm/2024-12-04-2/03.png =600x)

Type hoge in rich_text. Each time, the generated block_id was RlmLN.
{ "type": "rich_text", "block_id": "RlmLN", "elements": [ { "type": "rich_text_section", "elements": [ { "text": "hoge", "type": "text" } ] } ] } Therefore, if there’s a possibility of identical inputs, it’s safe to assume that block_id collisions are just as likely. Solution When merging multiple blocks into a single message for Slack, each block_id within the message must be unique. The simplest solution is to delete the block_id , especially if interactivity isn’t needed. Since Slack will automatically generate a new block_id if one isn’t specified, you can remove it and proceed with sending the message. Here’s part of the formatting method: The delete operator is used to remove the block_id property from the object. // The items obtained from client.apps.datastore.query are included in the event argument // Reference: https://api.slack.com/methods/apps.datastore.query#examples function eventMessage(event) { // ... event.description.forEach((description) => { if (description.block_id) { delete description.block_id // 🐈❗️ } message.push(description) }) // ... } Now, you can send messages without worrying about block_id conflicts! However, removing block_id from the object is not the best approach if you need interactive functionality. In that case, it’s best to generate and assign a block_id on the application side before sending the message. Conclusion This was a story about block_id conflicts preventing messages from being sent!
This article is the entry for day 19 in the KINTO Technologies Advent Calendar 2024 🎅🎄

Hello, I am GOSEO, an iOS engineer in the Mobile App Development Group at KINTO Technologies. Currently, I'm working on the app Unlimited. Interestingly, 66% of the iOS developers working on Unlimited are from overseas, and I enjoy chatting with them in English every day. I am also a fan of Paradox games and am currently hooked on Crusader Kings III.

Evaluating the migration from Google Maps to MapKit in the Unlimited App

In recent years, improvements in Apple Maps' performance have sparked interest in transitioning from Google Maps to MapKit. This shift is expected to reduce usage fees and enhance app performance and user experience. In this article, I will walk you through the implementation process, the challenges we encountered, and the outcomes of migrating from Google Maps to MapKit in the Unlimited app.

Evaluating the migration from Google Maps to MapKit and its process

1. Rendering maps and creating gradient lines

In the Unlimited app, gradient lines are rendered on Google Maps. To replicate this functionality in MapKit, we tested MKGradientPolylineRenderer. By setting colors and specifying the start and end points using locations, we examined whether this implementation would work effectively. This feature could also be used in the future to dynamically change the line's color when the user exceeds the speed limit.

func mapView(_ mapView: MKMapView, rendererFor overlay: MKOverlay) -> MKOverlayRenderer {
    if let polyline = overlay as? MKPolyline {
        let gradientRenderer = MKGradientPolylineRenderer(polyline: polyline)
        gradientRenderer.setColors(
            [Asset.Colors.primary.color, Asset.Colors.cdtRouteGradientLight.color],
            locations: [0.0, 1.0]
        )
        gradientRenderer.lineWidth = 2.0
        return gradientRenderer
    }
    return MKOverlayRenderer(overlay: overlay)
}

2. Differences in tap detection

In Google Maps, detecting taps on the map or markers is straightforward. MapKit, however, does not offer a built-in API for this. To detect taps on the map itself, we used UITapGestureRecognizer; marker taps are handled in the didSelect and didDeselect methods. It was a bit challenging to figure out whether a tap was on the map or a marker, but we resolved this by checking whether there was a marker at the tapped location.

Challenge: Setting up custom gestures required extra effort, but we confirmed that it works as intended.

let tapGestureRecognizer = UITapGestureRecognizer(target: context.coordinator, action: #selector(context.coordinator.handleMapTap(_:)))
tapGestureRecognizer.delegate = context.coordinator
mapView.addGestureRecognizer(tapGestureRecognizer)

3. Adding and managing markers

On the Unlimited map, multiple types of markers often overlap. To handle this, we used zPriority to display markers in order of importance. By reusing instances of the same marker image, we avoided generating separate instances for each marker, which improved performance.

Challenge: The default tap animation wouldn't go away... After much trial and error, we found a solution: instead of adding an image directly to MKAnnotationView, we added a UIView as a subview of the annotation view, added a UIImageView to that UIView, and finally set the image on the UIImageView. This effectively disabled the default animation.
The solution was truly a stroke of genius from one of my teammates!

4. Slow rendering

When we tested the map using driving data from Oita to Fukuoka, we found a bug where line rendering couldn't keep up during repeated zooming in and out after the map was displayed. The dataset contained 23,000 coordinate points, and rendering occurred every time the map view changed, consuming significant memory and CPU resources during UI updates.

Challenge: Rendering couldn't keep up with large numbers of coordinate points. We tackled this by using the Ramer-Douglas-Peucker algorithm to simplify and reduce similar coordinate points, which allowed us to consolidate the data by simplifying and segmenting the polylines.

// A function to interpolate UIColor. Returns the color interpolated between two colors based on the fraction value
func interpolateColor(fraction: CGFloat) -> UIColor {
    // Retrieve the RGBA components of the start and end colors
    let fromComponents = Asset.Colors.primary.color.cgColor.components ?? [0, 0, 0, 1]
    let toComponents = Asset.Colors.cdtRouteGradientLight.color.cgColor.components ?? [0, 0, 0, 1]
    // Interpolate colors based on the fraction
    let red = fromComponents[0] + (toComponents[0] - fromComponents[0]) * fraction
    let green = fromComponents[1] + (toComponents[1] - fromComponents[1]) * fraction
    let blue = fromComponents[2] + (toComponents[2] - fromComponents[2]) * fraction
    return UIColor(red: red, green: green, blue: blue, alpha: 1)
}

// A function to generate polyline information from an array of coordinates
func makePolylines(_ coordinates: [CLLocationCoordinate2D]) -> [PolylineInfo] {
    // Return an empty array if the coordinates array is empty
    guard !coordinates.isEmpty else { return [] }
    // Calculate the chunk size (at least the entire set as one chunk)
    let chunkSize = coordinates.count / 20 > 0 ? coordinates.count / 20 : coordinates.count
    var cumulativeDistance = 0.0
    let totalDistance = coordinates.totalDistance() // Calculate the total distance
    var previousEndColor: UIColor = Asset.Colors.primary.color
    var previousEndCoordinate: CLLocationCoordinate2D?
    var polylines: [PolylineInfo] = []
    // Divide the coordinates into chunks and process each chunk
    let chunks = stride(from: 0, to: coordinates.count, by: chunkSize)
        .map { startIndex -> [CLLocationCoordinate2D] in
            // Retrieve the coordinates of the chunk and prepend the last coordinate of the previous chunk
            var chunk = Array(coordinates[startIndex..<min(startIndex + chunkSize, coordinates.count)])
            if let lastCoordinate = previousEndCoordinate {
                chunk.insert(lastCoordinate, at: 0)
            }
            previousEndCoordinate = chunk.last
            return chunk
        }

    for chunk in chunks {
        let chunkDistance = chunk.totalDistance() // Calculate the distance of the chunk
        let startFraction = cumulativeDistance / totalDistance // Fraction for the start point
        cumulativeDistance += chunkDistance
        let endFraction = cumulativeDistance / totalDistance // Fraction for the end point
        let startColor = previousEndColor
        let endColor = interpolateColor(fraction: CGFloat(endFraction)) // Calculate the end color using interpolation
        previousEndColor = endColor
        // Simplify the polyline (reduce points while maintaining high accuracy)
        let simplified = PolylineSimplifier.simplifyPolyline(chunk, tolerance: 0.00001)
        let polyline = MKPolyline(coordinates: simplified, count: simplified.count)
        // Add the polyline information to the list
        polylines.append(PolylineInfo(
            polyline: polyline,
            startFraction: startFraction,
            endFraction: endFraction,
            startColor: startColor,
            endColor: endColor
        ))
    }
    return polylines
}

// A function to simplify coordinates (implements the Ramer-Douglas-Peucker algorithm)
static func simplifyPolyline(_ coordinates: [CLLocationCoordinate2D], tolerance: Double) -> [CLLocationCoordinate2D] {
    // Return the coordinates as-is if there are 2 or fewer points, or if the tolerance is negative
    guard coordinates.count > 2 else { return coordinates }
    guard tolerance >= 0 else { return coordinates }
    var result: [CLLocationCoordinate2D] = []
    var stack: [(startIndex: Int, endIndex: Int)] = [(0, coordinates.count - 1)]
    var include: [Bool] = Array(repeating: false, count: coordinates.count)
    include[0] = true
    include[coordinates.count - 1] = true
    // Process iteratively using a stack
    while !stack.isEmpty {
        let (startIndex, endIndex) = stack.removeLast()
        let start = coordinates[startIndex]
        let end = coordinates[endIndex]
        var maxDistance: Double = 0
        var currentIndex: Int?
        // Find the farthest point from the current line
        for index in (startIndex + 1)..<endIndex {
            let distance = perpendicularDistance(point: coordinates[index], lineStart: start, lineEnd: end)
            if distance > maxDistance {
                maxDistance = distance
                currentIndex = index
            }
        }
        // If the farthest point exceeds the tolerance, include it and subdivide further
        if let currentIndex, maxDistance > tolerance {
            include[currentIndex] = true
            stack.append((startIndex, currentIndex))
            stack.append((currentIndex, endIndex))
        }
    }
    // Add only the coordinates where include is true to the result
    for (index, shouldInclude) in include.enumerated() where shouldInclude {
        result.append(coordinates[index])
    }
    return result
}

// A function to calculate the perpendicular distance between a point and a line
private static func perpendicularDistance(point: CLLocationCoordinate2D, lineStart: CLLocationCoordinate2D, lineEnd: CLLocationCoordinate2D) -> Double {
    let x0 = point.latitude
    let y0 = point.longitude
    let x1 = lineStart.latitude
    let y1 = lineStart.longitude
    let x2 = lineEnd.latitude
    let y2 = lineEnd.longitude
    // Distance formula (distance between a point and a line in a 2D plane)
    let numerator = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
    let denominator = sqrt(pow(y2 - y1, 2) + pow(x2 - x1, 2))
    // If the length of the line is 0, the distance is 0
    return denominator != 0 ? numerator / denominator : 0
}

5. Verification results and conclusions

We confirmed that the functionality implemented with Google Maps in the Unlimited app can also be achieved using MapKit, which makes migration from Google Maps to MapKit feasible. The migration is also likely to lower usage fees. Through this research, the Unlimited iOS team as a whole has gained a deeper understanding of MapKit's capabilities.

Conclusion

Looking ahead, we plan to continue development using MapKit for this project. We will keep striving for further improvements to deliver even better services!
This article is the entry for Day 12 of the KINTO Technologies Advent Calendar 2024 🎅🎄

Hi, I'm Nakanishi from the Manabi-no-Michi-no-Eki (Learning Roadside Station) team. This year, the Learning Roadside Station project was officially launched and structured as an organization. As part of our initiatives, we also run an in-house podcast, and for this year's Advent Calendar we'd like to share more about it.

What is Manabi-no-Michi-no-Eki (the Learning Roadside Station)?

It's a project aimed at making the frequently held in-house study sessions more accessible and effective. The initiative is led by passionate volunteers within the company, with the goal of supporting study sessions and fostering a culture of knowledge sharing across the organization.

Joint Study Group

For the first episode of the Learning Roadside Station Podcast, we interviewed the organizing members, Asahi-san, Kiyuno-san, and Rina-san, about a joint study session held within the company. In the podcast, we discussed in detail the background and purpose of the study sessions, key aspects of their management, and future plans.

Interview

HOKA-san (interviewer): First, could you tell us how the Learning Roadside Station project started?

HOKA-san: This project was launched to further promote a positive learning culture within the company. It all started when Kageyama-san and several others came together to create a system to support in-house study sessions.

HOKA-san: Can you tell us more about the background and purpose of the Joint Study Group?

Asahi-san: During the new employee orientation, I realized that there were few opportunities to learn about information related to other systems. That's when I felt the need for a dedicated space to catch up on the latest information, which led to the launch of the Joint Study Group.

HOKA-san: What are some of the key aspects you focus on when organizing study sessions?

Rina-san: We aim to share the latest information about other systems and products to provide a broader range of knowledge. Additionally, to increase participation, we prepare thoroughly in advance, and the organizing team members themselves are actively involved.

HOKA-san: What were the results of and feedback from the first study session?

Kiyuno-san: Initially, we expected around 34 participants, but in the end we had about 80 attendees, including those who joined via Zoom. We received a lot of positive feedback, such as being able to connect with colleagues they don't usually interact with and getting access to the latest updates on other products.

HOKA-san: How do you plan to continue the study sessions, and what are your future initiatives?

Asahi-san: Moving forward, we plan to create thematic discussion sessions and networking opportunities based on job roles, so that more employees can participate easily.

Rina-san: We have already decided on the next speaker, and we are committed to continuing these sessions regularly. Additionally, we will work on enhancing information sharing both internally and externally.

HOKA-san: What is the purpose of promoting the Learning Roadside Station project outside the company?

Rina-san: By sharing information externally, we hope to attract more participants both inside and outside the company. Through this, we aim to foster a stronger learning culture within the company.

HOKA-san: Lastly, how do you feel after this interview?

Asahi-san: I gained a new perspective by experiencing both the participant's and the organizer's viewpoints.
Being on the organizing side made me realize the importance and significance of these events, and it strengthened my motivation to make them even better. Kiyuno-san: Until now, I had only attended via Zoom, but by taking on a role in the organizing team, I realized how valuable audience reactions and comments are. I would like to continue actively participating in these sessions in the future. Summary The goal of the Learning Roadside Station project is not just to provide a space for study sessions, but to enrich the company culture through interaction and knowledge sharing among employees. By continuously holding study sessions and enhancing communication efforts, we aim to create an environment where more employees can participate, learn from each other, and grow together. This will contribute to ultimately improve overall skills and foster a strong sense of unity within the company. In this article, we shared the details of the Joint Study Group, the background of its operations, and future plans. Please look forward to the next study session!
Introduction
Hello, I'm Mizuki, and I joined in September! In this article, I asked everyone who joined the company in August and September 2024 about their impressions right after joining and compiled the results. I hope it will be useful content for anyone interested in KINTO Technologies, and a good retrospective for the members who took part!

K.W

Self-introduction
I'm a PdM on the JP Membership Platform team in the Shared Service Development Group of the Group Core System Department. I joined a major web service company as a PdM straight out of university, and KINTO Technologies (KTC) is my second company.

What is your team's structure?
The JP Membership Platform team develops the membership features used by KINTO Japan's web services and mobile apps. One team leader, one PdM, and six engineers (including partner-company members), for a total of eight.

First impressions of KTC? Any gaps from what you expected?
Everyone here is a mid-career hire and new people join almost every month, so I really sensed an atmosphere that welcomes newcomers. As for gaps: this was my first job change, so before joining I was worried about whether I could contribute, but thanks to the members' support I feel I've settled in better than I expected.

What's the workplace atmosphere like?
KTC is still a new company, so I think it's at the stage of building its rules and systems as it goes; precisely because of that, each member is proactive, and I feel there's a culture of actively adopting new development methods and initiatives.

How did you feel about writing a blog post?
I'd hardly ever shared anything through blogs, so I wrote this carefully while looking over past new-joiner entries.

Question from MK-san: Tell us about a recommended item you use at the office!
It's something I use at home rather than at the office, but I recommend a height-adjustable desk. This is the one I use: https://www.amazon.co.jp/dp/B08CXSV3RX

Shiori

![alt text](/assets/blog/authors/mizuki/20250123/suda.png =250x)

Self-introduction
I'm a Generative AI Engineer on the Generative AI Utilization Project in the IT/IS Department. My main mission is promoting the use of generative AI: running generative AI training, developing use cases, and providing technical support for generative AI.

What is your team's structure?
Five members including myself: three in Nagoya, one in Jimbocho, and one (me) in Muromachi.

First impressions of KTC? Any gaps from what you expected?
With monthly themed study sessions, regular Toyota Group technical study sessions, internal training, and the chance to attend external seminars related to our work, I felt the environment for catching up on technology and producing output is well established. The company actively encourages external communication such as the Tech Blog and conference talks, and in my fourth month I got the opportunity to present our internal generative AI initiatives at an external seminar.
After joining, I learned the word "menchaku" (面着). It's a Toyota term that roughly means "communicating face to face," and I feel this menchaku culture, which balances in-person and online communication, reduces miscommunication and promotes quick decision-making and value creation.
As Toyota's in-house development organization, our business domains are wide-ranging: not only mobility products such as KINTO but also finance, MaaS, technical support for in-car systems, dealership DX, and generative AI support, so there are opportunities to broaden your knowledge across many areas.
I also often feel that decision-making is democratic. For example, the office drink flavors are swapped between spring/summer and autumn/winter, and at the changeover a "favorite flavor survey" is run; I was really surprised at how attentive the setup is. (The current autumn/winter flavors are based on the survey results.)

What's the workplace atmosphere like?
Whether on Slack or in person, whenever I mention I'm a little stuck, people follow up carefully, so it feels like a very warm environment. It's easy to casually invite people to lunch, and there are chances to get to know members outside my department, which is a lot of fun.

How did you feel about writing a blog post?
✌️😆✌️

Question from K.W: Do you have a personal motto?
I'm not sure it counts as a motto, but I do have effort themes for 2024 and 2025: "Pivot" and "there's no time like the present" (思い立ったが吉日). In the generative AI field, technology and information are updated extremely fast. Major technical announcements happen regularly, and it's common for today's normal to be a different world from yesterday's. In that situation, without knowing whether something will work, you need to run a fast cycle of verifying ideas as soon as they occur to you, measuring the results, correcting course, and executing. So in everything from small work tasks to daily life, I try to shorten the time from "think of it" to "act on it."

Yukki~ (ゆっき〜)

![alt text](/assets/blog/authors/mizuki/20250123/yukki-.jpeg =250x)

Self-introduction
Nice to meet you, I'm Yukki~. I belong to the Corporate IT Group in the IT/IS Department. My job is to find and resolve issues across a wide range of areas, from internal information systems to network infrastructure and security.

What is your team's structure?
I'm on the Boost team, which has two members. It's a team for solving problems and providing support within the Corporate IT Group, and each member works daily on problem-solving in their area of strength. I'm currently involved in infrastructure design for the replacement of various KINTO systems, pushing improvement activities forward while keeping shu-ha-ri in mind. (Shu-ha-ri is the process of first following existing methods (shu), then breaking away from them (ha), and finally leaving them behind to create something new (ri).)

First impressions of KTC? Any gaps from what you expected?
Before joining, I assumed that even though it moves with startup speed, being under the umbrella of a large corporation would mean a certain amount of constraints. Once I joined, I found a company with a great deal of freedom and discretion and a venture-like sense of speed. At the same time, governance is solid where it needs to be, so speed and stability coexist. Also, while remote work is in principle set at twice a week, I'm grateful that they respond flexibly when family circumstances and the like make it unavoidable.

What's the workplace atmosphere like?
More people work from home than I expected; some days I was the only one in my row of desks, which surprised me. Since those around me actively use remote work, it feels like an easy environment to work from home in. That said, with permanently set-up video conferencing systems, Slack, and other IT equipment in place, it's easy to reach out to people casually, which is reassuring. It also helps enormously that my manager is available for consultation the moment I ask. And although it's a group initiative, there's a weekly 30-minute chat time, which gives us a chance to talk casually with members who are physically distant across Tokyo, Nagoya, and Osaka, and I think that's great too.

How did you feel about writing a blog post?
Even though we joined at the same time, the physical distance meant we rarely had a chance to talk, so I saw this new-joiner entry as a good opportunity to build connections with fellow joiners at other locations. Also, before joining KTC my role didn't really let me share information publicly, so going forward I'd like to gradually increase my opportunities for output.

Question from Shiori-san: I hear photography is your hobby. Any tips for improving or for shooting?
I'm still very much practicing, but lately I pay attention to the balance of far, middle, and near distances and to the arrangement of elements within the frame. I also go out looking for different subjects each season to increase my shooting opportunities.

Mizuki

Self-introduction
I belong to the Recruitment team of the HR Group in the Development Support Department. I work on engineer recruitment and also help run recruiting events in Osaka!

What is your team's structure?
The Recruitment team has six members.

First impressions of KTC? Any gaps from what you expected?
No particular gaps. I had previously been in charge of the company as a recruitment agent, so I already understood the atmosphere and culture! Even so, I feel each employee is entrusted with more discretion than I had imagined.

What's the workplace atmosphere like?
I feel the team truly embodies "Respect for everything," one of the values the HR team holds most dear. I worked at Osaka Tech Lab until December, and I was struck by how *very* homey the atmosphere there is. I hope Osaka Tech Lab can stay just the way it is.

How did you feel about writing a blog post?
Since I'm based at a different location and orientation was mostly just putting faces to names, this blog was a great chance to learn about everyone's hobbies and work.

Question from Yukki~-san: What's your most recommended tsukemen restaurant?
I've listed one per location — do go try them!
Tokyo: Dogenzaka Mammoth (Shibuya) https://tabelog.com/tokyo/A1303/A130301/13122700/
Osaka: Miyata Menji (Shinsaibashi) https://miyata-menji.com
Nagoya: Hanzo (Fujigaoka) https://www.menya-hanzo.com/hanzo.html

R.K

Self-introduction
I belong to the Marketing Product Group in the Mobility Product Development Department. I'm currently the PdM for "Prism Japan," an outing app and website. At my first company, an HR-services firm, I was in charge of app and website growth; KTC is my second company.

What is your team's structure?
One team leader who doubles as an engineer, two engineers, one marketer, one writer, and one PdM.

First impressions of KTC? Any gaps from what you expected?
I had heard the team got along well, and that turned out to be exactly right. This was my first job change and I was worried about whether I could be useful, but with generous support from everyone on the team I'm managing to hold my own!

What's the workplace atmosphere like?
We're responsible for a product in its growth phase, so every team member works cheerfully, stretching beyond their own roles!

How did you feel about writing a blog post?
I haven't had many chances to write blog posts, so I used everyone else's comments as reference!

Question from Mizuki-san: What's the best overseas destination you've ever visited?
Setting aside the classics, New Zealand was wonderful! I did a road trip around the South Island visiting all sorts of spots; the nature was rich and magnificent, the food delicious, the towns beautiful, and time flowed slowly. The trip exceeded my expectations, and I'd love to go back someday!

Chen (陳)

Self-introduction
I'm from Xiamen (Amoy), China, and worked as an engineer in China's internet industry for about ten years. KTC is the first Japanese company I've worked for. I currently belong to DBRE in the Platform Development Department, developing operational tools around databases. My MBTI type is INFP. My hobby is simply trying new things: in just three months I visited almost every restaurant around Jimbocho (though half of them were curry places). I like touring Tokyo's art museums, especially for contemporary art (not understanding it is part of the fun). And I'm a sucker for new releases.

What is your team's structure?
Five of us, including me, work in Jimbocho; we recently merged with SRE, so there are seven in total.

First impressions of KTC? Any gaps from what you expected?
My first impression was that the office isn't flashy, but it doesn't feel like a traditional Japanese company either (it does have a foreign-company vibe). At orientation there was a lot of information about internal tools and initial setup, and even with my IT background it took me time to get used to it. The Jimbocho office has few people, so it's a quiet environment where you can concentrate. I was also glad to hear that many people work from home (few companies in China allow it).

What's the workplace atmosphere like?
My teammates are very kind and diligent. An attitude of attentive support for internal clients (people on the product side) and fellow team members is rare, but in DBRE it genuinely exists. During onboarding I was never brushed off for lacking background knowledge; I'm constantly told, "If you have questions, don't hesitate to ask." Nor was there any "let's see what you've got" testing during my newcomer period. Concrete tasks go into development only after we've discussed them together. Having always solved problems alone before, I was surprised at first, but once you get used to it the pace feels comfortable. I learn a lot from other members' analyses, and with the sense that "everyone has agreed," my confidence in tackling the work doubles.

How did you feel about writing a blog post?
I want to improve my output skills, so I intend to keep taking on challenges like this. Reading my teammates' Tech Blog posts, I felt that what I lack is that kind of ability to organize information.

Question from R.K-san: You've been to almost every restaurant around Jimbocho — tell us your favorite lunch spots!
Thank you! For reference, my personal preference is Southeast Asian food (sour and sweet, with plenty of varied ingredients).
[Thai] Bangkok Cosmo Shokudo is my top recommendation. Prices are reasonable even at night, portions are generous, and you can't go wrong with any item on the menu.
[Ramen] Honestly, I can never line up for Ramen Jiro — the wait is over three hours every time...😅 Instead I recommend Yojimbo; I personally prefer it to Butayama. Ebimaru Ramen is also a must — the umami of the soup is truly superb.
[Korean] Only after eating at Keijoen did I learn it had been featured on "Kodoku no Gurume" (Solitary Gourmet), which surprised me. The komtang soup set is delicious and reasonably priced.
[Indian] Namaste India on the first floor is plenty good (honestly, I can't tell much difference between Indian restaurants...😂). Recently I found a place in Ginza called Gurgaon; their cheese kulcha is delicious and worth the special trip.
[Curry] I recommend these three: Panch Mahal (the lassi is superb), Takeuchi Jimbocho Honten (warning: some menu items are quite spicy), and Chantoya Coconut Curry (the coconut-based soup curry is memorable).

T.A.

![alt text](/assets/blog/authors/mizuki/20250123/ki2.jpeg =250x)

Self-introduction
I'm in the Security & Privacy Group of the IT/IS Department, in charge of information security. My hobby is doing things solo.

What is your team's structure?
Three members, including the team leader and a contractor. We're right in the middle of the busy season, so we're huffing and puffing.

First impressions of KTC? Any gaps from what you expected?
I found it a company with flexibility and a sense of speed. Many people here have different nationalities and backgrounds, so even casual chats teach me a lot.

What's the workplace atmosphere like?
There's an atmosphere of daily improvement — it genuinely feels like something gets upgraded every week — so my curiosity is constantly stimulated. We often think about how to do things more efficiently, so I feel I'm getting polished up myself.

How did you feel about writing a blog post?
I'd better be careful not to make any typos...

Question from Chen-san: What exactly do you do in your solo activities?
I often go drinking alone, see movies alone, and camp solo. Recently I went to a Christmas market by myself. The illuminations were beautiful.

MK

Self-introduction
I'm a front-end engineer in the DX Development Group of the Mobility Product Development Department, developing products that support the digital transformation of dealership operations. My career has gone sales → office work → front-end engineer. I like prosciutto.

What is your team's structure?
We're a team of nine building multiple products. Everyone is an engineer who can handle a variety of technologies, so rather than fixing our domains, we code and plan according to each project.

First impressions of KTC? Any gaps from what you expected?
Honestly, before joining I wondered whether it might be rather rigid...? But once inside, it turned out to be quite a flexible company. My impression is that it takes the best of both worlds: TOYOTA's stability and the freshness of a still-new company.

What's the workplace atmosphere like?
The passion for technology runs hot. The passion for the products runs hot too. Communication is lively, and there's real energy.

How did you feel about writing a blog post?
I thought the voices of members actually doing the development could be a good source of information for people outside the company! I'm also looking forward to reading everyone's impressions of the company in their entries.

Question from T.A.-san: Are there any superstitions or lies you believed (or were tricked into believing) as a child?
That the purple part of a pigeon's neck is poisonous.

In Closing
Thank you all for sharing your impressions after joining! New members are joining KINTO Technologies every day! More new-joiner entries from people across various departments are on the way, so we hope you'll look forward to them. And KINTO Technologies is still recruiting colleagues across many departments and roles! For details, please check here!
This article is the entry for Day 13 of the KINTO Technologies Advent Calendar 2024 🎅🎄 Hello! My name is fabt, and I am developing the Android version of an app called Unlimited at KINTO Technologies. I recently implemented screenshot testing, which has been a hot topic, so I will walk through the implementation steps, the stumbling blocks I encountered, and how I solved them. I hope this will be helpful for those considering introducing screenshot testing in the near future.

What is Screenshot Testing?
Screenshot testing is a method where screenshots of the app are taken from the current source code in development and compared against previous versions to detect and verify changes. Roughly speaking, it can detect even a 1dp difference that is hard for the human eye to notice, making it easier to ensure no unintended UI modifications slip through. Many conference sessions, including one at DroidKaigi 2024, have addressed this topic, and I sense it's gaining significant attention lately.

Screenshot Test Library Selection
I decided to use Roborazzi, a tool featured in numerous conference presentations and already used in several Android apps within our company.

Trying it out
Following the official setup guide and information on the Internet, I'll try to run it in a local environment. Since I'm not doing anything special at this stage, I'll move forward quickly. First, install the libraries needed to run the screenshot test.

Version Catalog (libs.versions.toml)

```toml
[versions]
robolectric = "4.13"
roborazzi = "1.29.0"

[libraries]
androidx-compose-ui-test-junit4 = { module = "androidx.compose.ui:ui-test-junit4" }
androidx-compose-ui-test-manifest = { module = "androidx.compose.ui:ui-test-manifest" }
robolectric = { module = "org.robolectric:robolectric", version.ref = "robolectric" }
roborazzi = { module = "io.github.takahirom.roborazzi:roborazzi", version.ref = "roborazzi" }
roborazzi-compose = { module = "io.github.takahirom.roborazzi:roborazzi-compose", version.ref = "roborazzi" }
roborazzi-junit-rule = { module = "io.github.takahirom.roborazzi:roborazzi-junit-rule", version.ref = "roborazzi" }

[plugins]
roborazzi = { id = "io.github.takahirom.roborazzi", version.ref = "roborazzi" }
```

root build.gradle.kts file

```kotlin
plugins {
    alias(libs.plugins.roborazzi) apply false // the alias already carries the catalog version
}
```

module build.gradle.kts file

```kotlin
plugins {
    alias(libs.plugins.roborazzi)
}

android {
    testOptions {
        unitTests {
            isIncludeAndroidResources = true
            all {
                it.systemProperties["robolectric.pixelCopyRenderMode"] = "hardware"
            }
        }
    }
}

dependencies {
    // robolectric
    testImplementation(libs.androidx.compose.ui.test.junit4)
    debugImplementation(libs.androidx.compose.ui.test.manifest)
    testImplementation(libs.robolectric)

    // roborazzi
    testImplementation(libs.roborazzi)
    testImplementation(libs.roborazzi.compose)
    testImplementation(libs.roborazzi.junit.rule)
}
```

You should now have successfully added the necessary libraries. Now, let's create a test class like the one below and run it locally!
```kotlin
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.ui.test.junit4.createComposeRule
import androidx.compose.ui.test.onRoot
import androidx.test.ext.junit.runners.AndroidJUnit4
import com.github.takahirom.roborazzi.DEFAULT_ROBORAZZI_OUTPUT_DIR_PATH
import com.github.takahirom.roborazzi.captureRoboImage
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith
import org.robolectric.annotation.GraphicsMode

@RunWith(AndroidJUnit4::class)
@GraphicsMode(GraphicsMode.Mode.NATIVE)
class ScreenShotTestSample {

    @get:Rule
    val composeTestRule = createComposeRule()

    @Test
    fun sample() {
        composeTestRule.apply {
            setContent {
                MaterialTheme {
                    Surface {
                        Text(text = "screen shot test sample")
                    }
                }
            }
            onRoot().captureRoboImage(
                filePath = "$DEFAULT_ROBORAZZI_OUTPUT_DIR_PATH/sample.png"
            )
        }
    }
}
```

By running the following screenshot capture command in the Android Studio terminal, a screenshot of the Text(text = XXXX) component should be output as a PNG file.

```
./gradlew recordRoborazziDebug
```

![](/assets/blog/authors/f.tsuji/2024-12-13/folder_empty_roborazzi.png =750x)

Surprisingly, there was no output. This was an unexpected result.

Test Gets Skipped
Although the command execution was successful, for some reason no screenshot was output. While investigating, I ran the unit tests locally and noticed that the following was displayed:

![](/assets/blog/authors/f.tsuji/2024-12-13/Test_events_were_not_received.png =750x)

For some reason, the test events were not received, causing the test to be skipped. If that is the case, it certainly makes sense that the command succeeded but produced no output. After further investigation, the problem seemed to be a conflict between JUnit versions. Our app typically uses Kotest for unit testing, which runs on JUnit 5, while Roborazzi (or more specifically, Robolectric) uses JUnit 4 as its test runner. Sure enough, there were similar reports among Roborazzi's GitHub issues. The solution was to use junit-vintage-engine, the library mentioned in the above issue. Briefly, this library allows JUnit 4 and JUnit 5 to coexist and is sometimes used for migration. Now, let's add junit-vintage-engine and run the command again to see if it works. First, add the dependencies (the parts introduced in the previous section are omitted).

Version Catalog (libs.versions.toml)

```toml
[versions]
junit-vintage-engine = "5.11.2"

[libraries]
junit-vintage-engine = { module = "org.junit.vintage:junit-vintage-engine", version.ref = "junit-vintage-engine" }
```

module build.gradle.kts file

```kotlin
dependencies {
    testImplementation(libs.junit.vintage.engine)
}
```

Run the screenshot save command again. This time, the Text(text = XXXX) screenshot should finally be output as a PNG file.

![](/assets/blog/authors/f.tsuji/2024-12-13/build_failure.png =750x)
Execution result

:::message alert
After using junit-vintage-engine to allow the JUnit versions to coexist, the test was able to run, but the result was a failure.
:::

Initialization Failure
It seems that the execution failed. Before adding the library, the test couldn't even be executed, so let's take this as a step forward and tackle it head-on. It's a process of trial and error. As the first step in troubleshooting, I checked the logs and found the following output.

![](/assets/blog/authors/f.tsuji/2024-12-13/build_failure_log.png =750x)

It seems like something failed during initialization.
When I checked the output logs, it seemed that Robolectric was trying to initialize our app's Application class. After a closer look at the Robolectric configuration, I found the following statement:

> Robolectric will attempt to create an instance of your Application class as specified in the AndroidManifest.

So Robolectric creates an instance of the Application class specified in AndroidManifest.xml. This matches what I saw in the logs! To prevent the initialization failure, I'll follow the official documentation and use the @Config annotation to specify a plain Application class.

```kotlin
import android.app.Application
import org.robolectric.annotation.Config

@RunWith(AndroidJUnit4::class)
@Config(application = Application::class) // Add
@GraphicsMode(GraphicsMode.Mode.NATIVE)
class ScreenShotTestSample {
```

With high expectations, run the screenshot save command (omitted) from the Android Studio terminal. Surely, this time the screenshot of Text(text = XXXX) will be output as a PNG file.

![](/assets/blog/authors/f.tsuji/2024-12-13/build_success.png =250x)
![](/assets/blog/authors/f.tsuji/2024-12-13/folder_roborazzi.png =750x)
![](/assets/blog/authors/f.tsuji/2024-12-13/result_sample.png =250x)

A screenshot was successfully generated!

:::message
After investigating the Robolectric configuration and specifying the Application class using the @Config annotation, the screenshot test finally succeeded.
:::

To try out the comparison (the actual screenshot test), I made a slight change to the test class and ran the comparison command.

```kotlin
// Text(text = "screen shot test sample")
Text(text = "compare sample") // Change the text
```

```
./gradlew compareRoborazziDebug
```

![](/assets/blog/authors/f.tsuji/2024-12-13/folder_compare_roborazzi.png =750x)

The comparison result was successfully generated!

![](/assets/blog/authors/f.tsuji/2024-12-13/result_sample_actual.png =250x)
Comparison image
![](/assets/blog/authors/f.tsuji/2024-12-13/result_sample_compare.png =750x)
Comparison result

Summary of Success So Far
I created a simple test class and was able to run a screenshot test without any problems. While adjusting the details of unit tests and property specifications can be challenging depending on the project, being able to visually confirm changes is clear and fast. Where possible, implementing screenshot testing can therefore help maintain higher app quality by making unintended UI changes easier to detect.

Side Note
Around this time, we also set up CI for our app. Using a GitHub Actions workflow, we got as far as saving and comparing screenshots and commenting the results on pull requests. I'll omit the details since we didn't do anything particularly special: we chose the companion-branch approach for storing results, as described in the official documentation. (We consolidated the YAML files and so on, but only to a general level of optimization for smoother implementation.)

:::message
When running ./gradlew recordRoborazziDebug, any regular unit tests within the same module are also executed. To address this in our app, we define our own properties and separate the Gradle tasks for running unit tests from the screenshot testing tasks that use Roborazzi.
Reference issues:
https://github.com/android/nowinandroid/issues/911
https://github.com/takahirom/roborazzi/issues/36
:::

Let's get back to the topic.

Including Preview Functions in Testing
When implementing Composable functions, it's common to create corresponding Preview functions as well.
If you implement tests manually as in the first half of this article, you end up writing three things:

- Composable implementation
- Preview implementation
- Test implementation

That's quite a hassle... So, let's make the Preview functions themselves the target of screenshot testing! That's exactly what this section is about. This is another technique that has been a hot topic.

Anyway, try it out
The procedure is generally as follows:

1. Collect Preview functions
2. Run screenshot tests on the collected functions

It's simple. There are several ways to collect Preview functions, but this time I'll use ComposablePreviewScanner. The main reason I chose it is that its setup steps are well documented, making investigation easier. Additionally, it seems likely to be officially integrated and supported in the future, though it's still experimental at the time of writing.

Now, let's install the necessary libraries.

Version Catalog (libs.versions.toml)

```toml
[versions]
composable-preview-scanner = "0.3.2"

[libraries]
composable-preview-scanner = { module = "io.github.sergio-sastre.ComposablePreviewScanner:android", version.ref = "composable-preview-scanner" }
```

module build.gradle.kts file

```kotlin
dependencies {
    // screenshot testing (Composable Preview)
    testImplementation(libs.composable.preview.scanner)
}
```

Next, let's create a test class by referring to the README of ComposablePreviewScanner and insights from those who have used it before.

```kotlin
import android.app.Application
import androidx.compose.ui.test.junit4.createComposeRule
import androidx.compose.ui.test.onRoot
import com.github.takahirom.roborazzi.DEFAULT_ROBORAZZI_OUTPUT_DIR_PATH
import com.github.takahirom.roborazzi.captureRoboImage
import com.kinto.unlimited.ui.compose.preview.annotation.DialogPreview
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith
import org.robolectric.ParameterizedRobolectricTestRunner
import org.robolectric.annotation.Config
import org.robolectric.annotation.GraphicsMode
import sergio.sastre.composable.preview.scanner.android.AndroidComposablePreviewScanner
import sergio.sastre.composable.preview.scanner.android.AndroidPreviewInfo
import sergio.sastre.composable.preview.scanner.android.screenshotid.AndroidPreviewScreenshotIdBuilder
import sergio.sastre.composable.preview.scanner.core.preview.ComposablePreview

@RunWith(ParameterizedRobolectricTestRunner::class)
class ComposePreviewTest(
    private val preview: ComposablePreview<AndroidPreviewInfo>
) {

    @get:Rule
    val composeTestRule = createComposeRule()

    @Config(application = Application::class)
    @GraphicsMode(GraphicsMode.Mode.NATIVE)
    @Test
    fun snapshot() {
        val fileName = AndroidPreviewScreenshotIdBuilder(preview).ignoreClassName().build()
        val filePath = "$DEFAULT_ROBORAZZI_OUTPUT_DIR_PATH/$fileName.png" // Preview function name.png

        composeTestRule.apply {
            setContent { preview() }
            onRoot().captureRoboImage(filePath = filePath)
        }
    }

    companion object {
        private val cachedPreviews: List<ComposablePreview<AndroidPreviewInfo>> by lazy {
            AndroidComposablePreviewScanner()
                .scanPackageTrees(
                    include = listOf("XXXX"), // Set the package(s) to scan for Preview functions
                    exclude = listOf()
                )
                .includePrivatePreviews() // Include private Preview functions
                .getPreviews()
        }

        @JvmStatic
        @ParameterizedRobolectricTestRunner.Parameters
        fun values(): List<ComposablePreview<AndroidPreviewInfo>> = cachedPreviews
    }
}
```

Now, execute the screenshot save command. ...However, it took a lot of time, and even after an hour it showed no signs of finishing. With fewer than 200 Preview functions, I didn't expect it to take this long.
As expected, it wasn't going to be that easy.

:::message alert
When I included Preview functions in the screenshot test, the test execution time became significantly longer.
:::

Test Execution Taking Too Long
After doing some research, I found similar cases. https://github.com/takahirom/roborazzi/issues/388 In that report, a single test took about 5 minutes to complete, but after removing CircularProgressIndicator, the execution speed improved significantly. Digging deeper into the issue discussion, it seems that Composables that include infinite animations take a long time to test. As a solution, it was suggested to set mainClock.autoAdvance = false to stop automatic synchronization with the Compose UI and advance time manually instead. By controlling time manually, we can capture screenshots at any point and avoid the impact of infinite animations. Since our app also uses CircularProgressIndicator, this was definitely worth trying, so I implemented it immediately, adding mainClock calls to the test class. The flow is:

1. Pause time.
2. Advance by 1,000 milliseconds (1 second).
3. Capture the screenshot.
4. Resume time.

```kotlin
composeTestRule.apply {
    mainClock.autoAdvance = false // 1
    setContent { preview() }
    mainClock.advanceTimeBy(1_000) // 2
    onRoot().captureRoboImage(filePath = filePath) // 3
    mainClock.autoAdvance = true // 4
}
```

(On a separate note, images loaded asynchronously with the coil library may not be tested correctly either (https://github.com/takahirom/roborazzi/issues/274), but since the solution seems to be the same, we should be able to handle both together.)

Alright, execution time!

![](/assets/blog/authors/f.tsuji/2024-12-13/build_failure_time.png =500x)

The test execution became much faster, but it failed.

:::message alert
While infinite animations were the bottleneck, manually controlling time during testing resolved that issue. Now the test runs quickly, but the result was still a failure.
:::

Failure When Dialogs Are Included
Although the execution speed improved, that's no use if the tests fail. In situations like this, we can always rely on the logs.

```
ComposePreviewTest > [17] > snapshot[17] FAILED
java.lang.AssertionError: fail to captureRoboImage
Reason: Expected exactly '1' node but found '2' nodes that satisfy: (isRoot)
Nodes found:
1) Node #534 at (l=0.0, t=0.0, r=0.0, b=0.0)px
2) Node #535 at (l=0.0, t=0.0, r=320.0, b=253.0)px Has 1 child
at androidx.compose.ui.test.SemanticsNodeInteraction.fetchOneOrThrow(SemanticsNodeInteraction.kt:178)
at androidx.compose.ui.test.SemanticsNodeInteraction.fetchOneOrThrow$default(SemanticsNodeInteraction.kt:150)
at androidx.compose.ui.test.SemanticsNodeInteraction.fetchSemanticsNode(SemanticsNodeInteraction.kt:84)
at com.github.takahirom.roborazzi.RoborazziKt.captureRoboImage(Roborazzi.kt:278)
at com.github.takahirom.roborazzi.RoborazziKt.captureRoboImage(Roborazzi.kt:268)
at com.github.takahirom.roborazzi.RoborazziKt.captureRoboImage$default(Roborazzi.kt:263)
at com.kinto.unlimited.ui.compose.ComposePreviewTest.snapshot(ComposePreviewTest.kt:49)
```

There were multiple logs like the one above, saying two root nodes were found. Searching based on the log information, I found an interesting issue. As it relates to compose-multiplatform, the cause may be different, but the failures seem to occur where the Dialog() Composable is used. Just when I was wondering whether I had to give up, I discovered that an experimental function has been added to capture images that include dialogs!
Since it's experimental, using it for all tests might not be ideal, but if I can determine whether the test target is a dialog, the test may succeed. So, I decided to create a custom annotation called DialogPreview, apply it to the Previews that include dialogs, and modify the test class to retrieve and check that information.

```kotlin
annotation class DialogPreview
```

```kotlin
@OptIn(ExperimentalRoborazziApi::class) // Add
@Config(application = Application::class)
@GraphicsMode(GraphicsMode.Mode.NATIVE)
@Test
fun snapshot() {
    val isDialog = preview.getAnnotation<DialogPreview>() != null // Add

    composeTestRule.apply {
        mainClock.autoAdvance = false
        setContent { preview() }
        mainClock.advanceTimeBy(1_000)
        if (isDialog) { // Add
            captureScreenRoboImage(filePath = filePath)
        } else {
            onRoot().captureRoboImage(filePath = filePath)
        }
        mainClock.autoAdvance = true
    }
}

companion object {
    private val cachedPreviews: List<ComposablePreview<AndroidPreviewInfo>> by lazy {
        AndroidComposablePreviewScanner()
            .scanPackageTrees(
                include = listOf("・・・"),
                exclude = listOf()
            )
            .includePrivatePreviews()
            .includeAnnotationInfoForAllOf(DialogPreview::class.java) // Add
            .getPreviews()
    }
}
```

The test checks whether the Preview has a DialogPreview annotation (i.e., it is not null) and uses captureScreenRoboImage() for dialogs. Let's run it.

![](/assets/blog/authors/f.tsuji/2024-12-13/build_success.png =250x)
![](/assets/blog/authors/f.tsuji/2024-12-13/folder_roborazzi_preview.png =750x)

Due to the nature of the Preview functions, the images and file names are masked. After loading the Preview functions, I successfully saved their screenshots!

:::message
Composable functions that include dialogs contain multiple root nodes, making it impossible to determine the root correctly, which led to the test failures. Although it's experimental, using a function that can capture dialogs allowed the tests to run successfully.
:::

Screenshot comparison was already confirmed earlier, so everything seems to be working as it is. Thank you for sticking with me through this long process!

Conclusion
In this article, I walked through the process of implementing screenshot testing, the challenges I encountered, and how I solved them. At first, I thought it would be as simple as adding a library and running the tests, but things didn't work out that way. After a lot of trial and error, I finally got screenshot testing to work. The solutions, such as adding libraries or applying annotations, were relatively simple; however, finding the right information to reach those solutions turned out to be more challenging than I expected. I hope this article will be of some help to you.
Introduction
Hello! I'm Nakatani (@tksnakatani) from the Common Service Development Group, where we plan and develop essential functions across web and mobile app services, including membership and payment platforms. In this article, I'll share a case of an Aurora MySQL deadlock that occurred in the production environment of a payment platform—one of those "close-call incidents" that many of us have likely encountered at least once.

The Incident Leading to the Deadlock
One day in 2024, we received an incident notification from our log monitoring system. When I reviewed the notification, I found that the following error log had been recorded during a credit card payment process:

```
Deadlock found when trying to get lock; try restarting transaction
```

Additionally, Slack alerted us to an inquiry from a product manager reporting that their credit card payment had failed. At that moment, I immediately sensed the gravity of the situation, and I still vividly remember breaking out in a cold sweat.

Cause investigation

Logic check
The deadlock itself resolved naturally after about 30 minutes. Upon further investigation, we discovered that a popular product had been released around the time the error occurred, leading to a high volume of purchase requests. A deadlock generally occurs when multiple transactions hold resources required by each other, causing a standstill. Given that load testing had been conducted to simulate such scenarios, it was puzzling why the deadlock still occurred. Initially, we couldn't pinpoint any specific processes that would cause resource contention, so it was difficult to identify the root cause of the deadlock from theory alone.

Reproduction check
Next, we tried to reproduce the requests to the API, both before and after the deadlock occurred, in a local environment. Using the same request parameters that caused the problem in the production environment, we sent the following two requests almost simultaneously using the curl command. As in the production environment, one request succeeded, but the other resulted in a system error. Here's an example of the curl commands we used:

```bash
curl --location 'http://localhost:8080/payments/cards' \
--header 'Content-Type: application/json' \
--data '{
  "amount": 10,
  "capture": false,
  "request_id": "ITEM-20240101-0000001"
}'

curl --location 'http://localhost:8080/payments/cards' \
--header 'Content-Type: application/json' \
--data '{
  "amount": 10,
  "capture": false,
  "request_id": "ITEM-20240101-0000002"
}'
```

The error log contained the following message:

```
Deadlock found when trying to get lock; try restarting transaction
```

Being able to reproduce the problem provided a crucial clue for our investigation, and for the moment, we felt a sense of relief.

SHOW ENGINE INNODB STATUS
Additionally, we used MySQL's SHOW ENGINE INNODB STATUS command to check the status of the InnoDB storage engine. This command provides comprehensive information about the state of the InnoDB storage engine, which gives clues for examining lock and transaction details and identifying the specific cause of the deadlock.

```sql
mysql> set GLOBAL innodb_status_output=ON;
mysql> set GLOBAL innodb_status_output_locks=ON;
-- ...sent the two curl requests again...
mysql> SHOW ENGINE INNODB STATUS;
```

The results at the time were as follows. *Some portions have been excerpted and masked.
```
=====================================
2024-xx-xx 10:05:27 0x7fe300290700 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 2 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 463 srv_active, 0 srv_shutdown, 7176 srv_idle
srv_master_thread log flush and writes: 0
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 318
OS WAIT ARRAY INFO: signal count 440
RW-shared spins 290, rounds 306, OS waits 16
RW-excl spins 1768, rounds 5746, OS waits 48
RW-sx spins 0, rounds 0, OS waits 0
Spin rounds per wait: 1.06 RW-shared, 3.25 RW-excl, 0.00 RW-sx
------------------------
LATEST DETECTED DEADLOCK
------------------------
2024-04-18 10:04:02 0x7fe3059a4700
*** (1) TRANSACTION:
TRANSACTION 12085, ACTIVE 6 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 3
MySQL thread id 70, OS thread handle 140612935517952, query id 28138 192.168.65.1 user update
insert into payments (....
*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 297 page no 5 n bits 248 index uq_payments_01 of table `payment`.`payments` trx id 12085 lock_mode X locks gap before rec
Record lock, heap no 56 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 11; hex 6d65726368616e745f3031; asc merchant_01;;
1: len 7; hex 5041594d454e54; asc PAYMENT;;
2: len 30; hex 6276346c6178316736367175737868757676647963356737656c616a6466; asc bv4lax1g66qusxhuvvdyc5g7elajdf; (total 32 bytes);
3: len 30; hex 70615f666f79706161656c6a71666f663378746332366b6d61756c38676e; asc pa_foypaaeljqfof3xtc26kmaul8gn; (total 35 bytes);
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 297 page no 5 n bits 248 index uq_payments_01 of table `payment`.`payments` trx id 12085 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 56 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 11; hex 6d65726368616e745f3031; asc merchant_01;;
1: len 7; hex 5041594d454e54; asc PAYMENT;;
2: len 30; hex 6276346c6178316736367175737868757676647963356737656c616a6466; asc bv4lax1g66qusxhuvvdyc5g7elajdf; (total 32 bytes);
3: len 30; hex 70615f666f79706161656c6a71666f663378746332366b6d61756c38676e; asc pa_foypaaeljqfof3xtc26kmaul8gn; (total 35 bytes);
*** (2) TRANSACTION:
TRANSACTION 12084, ACTIVE 7 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 3
MySQL thread id 69, OS thread handle 140612935812864, query id 28139 192.168.65.1 user update
insert into payments (....
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 297 page no 5 n bits 248 index uq_payments_01 of table `payment`.`payments` trx id 12084 lock_mode X locks gap before rec
Record lock, heap no 56 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 11; hex 6d65726368616e745f3031; asc merchant_01;;
1: len 7; hex 5041594d454e54; asc PAYMENT;;
2: len 30; hex 6276346c6178316736367175737868757676647963356737656c616a6466; asc bv4lax1g66qusxhuvvdyc5g7elajdf; (total 32 bytes);
3: len 30; hex 70615f666f79706161656c6a71666f663378746332366b6d61756c38676e; asc pa_foypaaeljqfof3xtc26kmaul8gn; (total 35 bytes);
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 297 page no 5 n bits 248 index uq_payments_01 of table `payment`.`payments` trx id 12084 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 56 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 11; hex 6d65726368616e745f3031; asc merchant_01;;
1: len 7; hex 5041594d454e54; asc PAYMENT;;
2: len 30; hex 6276346c6178316736367175737868757676647963356737656c616a6466; asc bv4lax1g66qusxhuvvdyc5g7elajdf; (total 32 bytes);
3: len 30; hex 70615f666f79706161656c6a71666f663378746332366b6d61756c38676e; asc pa_foypaaeljqfof3xtc26kmaul8gn; (total 35 bytes);
*** WE ROLL BACK TRANSACTION (2)
----------------------------
END OF INNODB MONITOR OUTPUT
============================
```

The above indicates the following:

- TRANSACTION 12085 and TRANSACTION 12084 exist.
- Both TRANSACTION 12085 and TRANSACTION 12084 acquired a gap lock on the same gap.
- TRANSACTION 12085 attempted to acquire an insert intention lock before insertion, but this conflicted with the gap lock held by TRANSACTION 12084, causing it to wait.
- TRANSACTION 12084 attempted to acquire an insert intention lock before insertion, but this conflicted with the gap lock held by TRANSACTION 12085, causing it to wait.
- MySQL detected a deadlock and rolled back TRANSACTION 12084.

What are gap locks and insert intention locks?

Gap lock
A gap lock is a lock on the gap between index records, or a lock on the gap before the first index record or after the last index record. For example, SELECT c1 FROM t WHERE c1 BETWEEN 10 and 20 FOR UPDATE; prevents other transactions from inserting a value of 15 into column t.c1, because the gaps between all existing values in the range are locked, whether or not such a value already exists in the column.
https://dev.mysql.com/doc/refman/8.0/ja/innodb-locking.html#innodb-gap-locks

Insert intention lock
An insert intention lock is a type of gap lock set by an INSERT operation before the row is inserted. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other as long as they are not inserting at the same position within the gap. Suppose there are index records with values 4 and 7. Separate transactions attempting to insert the values 5 and 6, respectively, each take an insert intention lock on the gap between 4 and 7 before acquiring an exclusive lock on the inserted row, but they do not block each other because the rows do not conflict.
https://dev.mysql.com/doc/refman/8.0/ja/innodb-locking.html#innodb-insert-intention-locks

Since we had discovered that gap locks were the cause of the problem, we proceeded to identify where in the credit card payment processing the gap lock was acquired.
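To make the lock interaction above concrete, here is a minimal sketch that reproduces the same deadlock pattern on a plain InnoDB table (the table and column names are simplified stand-ins for the payments table, and the default REPEATABLE READ isolation level is assumed): two sessions each run SELECT ... FOR UPDATE for a row that does not exist yet, each acquiring a gap lock on the same gap, and then both try to INSERT into that gap.

```sql
-- Simplified stand-in for the payments table
CREATE TABLE payments_demo (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  request_id VARCHAR(64) NOT NULL,
  UNIQUE KEY uq_request_id (request_id)
);

-- Session 1
BEGIN;
SELECT * FROM payments_demo WHERE request_id = 'ITEM-20240101-0000001' FOR UPDATE;
-- No matching row exists, so InnoDB takes a gap lock on the index gap.

-- Session 2
BEGIN;
SELECT * FROM payments_demo WHERE request_id = 'ITEM-20240101-0000002' FOR UPDATE;
-- Both request_ids fall into the same gap, and gap locks do not conflict
-- with each other, so this gap lock is also granted.

-- Session 1
INSERT INTO payments_demo (request_id) VALUES ('ITEM-20240101-0000001');
-- The insert intention lock conflicts with session 2's gap lock -> waits.

-- Session 2
INSERT INTO payments_demo (request_id) VALUES ('ITEM-20240101-0000002');
-- Conflicts with session 1's gap lock -> InnoDB detects a deadlock:
-- ERROR 1213 (40001): Deadlock found when trying to get lock;
-- one of the two transactions is rolled back.
```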
The overall flow of the payment processing is divided into the following three steps:

1. Check whether a payment has already been made with the same request ID.
2. Request payment from the payment service provider.
3. Write the result from the payment service provider into the database and return a response.

We set breakpoints around the areas where SQL is issued and debugged in a local environment, confirming that a gap lock was acquired immediately after executing the following query:

```sql
SELECT * FROM PAYMENTS where request_id = '' FOR UPDATE;
```

We also checked the data in performance_schema.data_locks at the time, which showed the same gap lock.

Cause
All the information had been gathered, and the cause was identified. In the payment platform, the request_id received from the request source was used to check for duplicate requests, and this value had a unique index for use in subsequent lookups. Meanwhile, on the product side, the request_id was generated according to the following rule:

```
Product ID-YYYYMMDD-Sequential Number
```

When the deadlock occurred, a popular product had just been released, and purchase requests for the same product were concentrated in large numbers over a short period. As a result, a large number of requests were sent in which the sequential-number portion of request_id was incremented rapidly. An example curl command:

```bash
curl --location 'http://localhost:8080/payments/cards' \
--header 'Content-Type: application/json' \
--data '{
  "amount": 10,
  "capture": false,
  "request_id": "ITEM-20240101-0000001"
}'
```

As mentioned above, the main flow of credit card payment processing is:

1. Check whether a payment has already been made with the same request ID.
2. Request payment from the payment service provider.
3. Write the result from the payment service provider into the database and return a response.

The problem occurred in the first step:

```sql
SELECT * FROM PAYMENTS where request_id = '' FOR UPDATE;
```

This query is executed under the assumption that requests with the same request_id will not be received. However, since the data for the request_id had not been inserted yet, the query matched no rows, and a gap lock was acquired. Subsequently, in the third step, the INSERT that writes the result attempted to acquire an insert intention lock; this conflicted with the gap lock already held by the other transaction, resulting in a wait. MySQL then detected a deadlock, and one transaction was rolled back.

Resolution
Through our investigation, we identified that the SELECT ... FOR UPDATE query used to check for duplicate payments was causing the deadlock. To resolve this, we decided to discontinue this query and revised the design: the data is now provisionally registered as soon as a request is received, and that transaction is committed immediately. Although there were concerns that the increased commit frequency might degrade performance, load testing confirmed that the necessary performance could be maintained, so we released it with this specification.

Reflection
This incident highlighted our insufficient understanding of gap locks. We didn't fully understand that a gap lock is acquired when a SELECT ... FOR UPDATE query matches no rows. While I believed I had carefully reviewed the manuals and incorporated the ideas into our designs, I now realize that I had assumed I knew everything.
https://dev.mysql.com/doc/refman/8.0/ja/innodb-locking.html I also regret that there were aspects we could have noticed during testing. In our load testing, we used random values (UUIDs) for the request_id , primarily to avoid performance degradation from index fragmentation and rebuilding. As a result, no deadlock occurred during the tests, and they completed successfully. Conclusion A deep understanding of transaction behavior and lock mechanisms is essential when working with MySQL and the InnoDB storage engine. This incident reinforced the importance of regularly reviewing documentation and specifications, and seeking expert advice when necessary. Additionally, I learned that investigating the actual parameters used in the production environment, and conducting tests with request values based on those parameters, can help identify issues early and improve overall quality.
Introduction
Hello! I am Akeda, a member of the Corporate IT Group, the Technology Public Relations Group, and the Manabi-no-Michi-no-Eki Team as well. (“Manabi-no-Michi-no-Eki” literally means “roadside station of learning.”) As a corporate engineer, my usual work includes handling IT equipment for onboarding and offboarding and improving the processes and work within my groups. In this article, I would like to share the story of how I, a corporate engineer, came to get involved in the Manabi-no-Michi-no-Eki project and to create a portal that gathers the videos from our actively held in-house study sessions.

The Trigger: Is There No Way to Watch Internal Study Sessions Later?
As we have mentioned in several articles below, our company frequently holds in-house study sessions.
https://blog.kinto-technologies.com/posts/2024-04-23_%E5%AD%A6%E3%81%B3%E3%81%AE%E9%81%93%E3%81%AE%E9%A7%85%E3%81%AF%E3%81%98%E3%82%81%E3%81%BE%E3%81%97%E3%81%9F/
https://blog.kinto-technologies.com/posts/2024-05-21-%E5%AD%A6%E3%81%B3%E3%81%AE%E9%81%93%E3%81%AE%E9%A7%85-iOS%E3%83%81%E3%83%BC%E3%83%A0%E3%81%AE%E5%8B%89%E5%BC%B7%E4%BC%9A%E3%81%AB%E7%AA%81%E6%92%83/
After the sessions, the video recordings get posted on the in-house Slack channel. However, as a corporate engineer, I saw the following internal issues:

- There are places in the company like Confluence, SharePoint, and Box for sharing document files, but no fixed place to store video content.
- Although the study sessions could benefit everyone in the company, the fact that their recordings were only posted on the in-house Slack channel made them difficult to find later. The company-wide information-sharing channel is the main place things get posted to, and other work-related information flows through it as well, so the videos were impossible to find by simple scrolling alone.
- Members who joined the company after a study session had been held had no way of knowing it had taken place at all, let alone that videos and files from it existed.

Wondering what could be done to solve these problems, I came up with the idea of building a video platform: a place where the videos of the study sessions could all be gathered together.

The Beginning: I Tried Laying My Idea Before the Manabi-no-Michi-no-Eki Project Members
When I had the idea of creating a place to gather study session videos, I immediately thought of the Manabi-no-Michi-no-Eki project. Manabi-no-Michi-no-Eki is an activity that acts as a “michi-no-eki” (roadside station) where internal study sessions intersect, supporting the revitalization of the company centered on them. I figured that if what I wanted to do overlapped with their activities, or was something they were already considering, it would be great to work on it together. As the saying goes, “strike while the iron is hot,” so I started by approaching Kin-chan, who is in the same groups as me and is also involved in Manabi-no-Michi-no-Eki. The response I got was an enthusiastic thumbs-up! (I remember being overjoyed about that.) After that, once the idea had gotten the green light within the Corporate IT Group, I immediately went to discuss it with the other project members as well. I think the back-and-forth on Slack at the time conveys how enthusiastic the Manabi-no-Michi-no-Eki members were about it, so I will share some of it with you here. First, here is the discussion message from me.
Did you notice the amazing emoji reaction? Here is the answer I got, with the same momentum. Things moved quickly! After that, on the same day that we all got together, I was warmly invited to join Manabi-no-Michi-no-Eki. And so we all ended up working on the video platform together as members of the Manabi-no-Michi-no-Eki team.

Main Story: We Built a Video Platform and Rolled It Out Internally
I imagine that rolling out an internal platform of collected videos from company-wide meetings and in-group study sessions is something some of this blog’s readers would like to do themselves, so I will now describe what we did to achieve it and pick out a few points to explain why we did them.

What We Did
Building the video platform
- Utilizing the SharePoint site already in use in the company, we gave the video posting method (among other things) a makeover.

Deciding how to collect videos
- Collect videos of meetings and internal study sessions that are OK to publish within the company.
- Have the meeting/session organizers upload video files and publishable files to the documents section of the SharePoint site.
- On the top page of the SharePoint site, use the “highlighted content” feature and set a filter on the study session name. Once this setup is done, any video uploaded with the same study session name gets posted on the top page automatically.

Spreading the word and running the system
- We announced the video platform and asked people to collect videos for it at the all-company meeting at the end of August.

Q. Why Did You Choose SharePoint?
A. Because our company uses Microsoft 365 for groupware. There are various video platforms to choose from, YouTube, Vimeo, and Brightcove being prominent examples. However, using one of those would entail signing a new contract, and in the first place, one of the requirements was that the collection of session videos be handled by the organizers themselves. Consequently, we opted for SharePoint, which everyone in the company is already familiar with.

Q. Why Did You Decide to Get the Organizers Themselves to Upload the Videos?
A. We wanted the organizers themselves to handle things like editing out the unnecessary parts, and we thought it would be best if they decided for themselves whether to upload something to the video platform. As third parties, we Manabi-no-Michi-no-Eki members could have done the work for them, but the organizers are the most suitable people to decide whether a video should include things like the casual chit-chat before the session started, or moments that only make sense if you were actually there, and whether the video should be uploaded to the platform at all.

Q. Why Did You Decide to Use the “Highlighted Content” Feature?
A. Because it was simple and easy. When we thought about how to get people to watch the uploaded videos, we presumed the reach would be higher if people could spot videos they were interested in via images rather than text, and could watch them from the top page rather than somewhere a few clicks away. We then created several patterns for how to access the videos and had Manabi-no-Michi-no-Eki members try them. We found that the format that attracted viewers’ interest the most was one where we could display thumbnails and list the videos grouped by study session on the SharePoint site’s top page.
The “highlighted content” feature lets you do all of that, so that is what we decided to use.

Example of using it
Reference: https://support.microsoft.com/ja-jp/office/強調表示されたコンテンツの-web-パーツを使用する-e34199b0-ff1a-47fb-8f4d-dbcaed329efd

A cautionary note: filtering by study session name (= video file name) might not catch videos that have only just been uploaded. When that happens, please wait a little while, then try again.

Q. Did You Only Spread the Word about It That One Time at the All-Company Meeting?
A. We tell people about it every month in the orientation for new employees, and in the Manabi-no-Michi-no-Eki and Technology Public Relations Group sections of the monthly all-company meetings. As I said at the beginning of the article, one issue was that members who joined the company after a study session was held had no way of knowing it had taken place, let alone that there were videos and files from it. Of course, we addressed that here. With these two touchpoints, we are successfully keeping the platform's presence felt!

In Conclusion: We Will Continue to Support Internal Learning in a Variety of Ways!
Since the all-company meeting, the number of accesses to the platform has not decreased significantly, and study session videos keep being collected. This reinforces my belief that the platform is addressing the issues I initially identified. So far, I have talked about three themes: the internal issues I perceived; how I found a strong supporter within the company in the form of Manabi-no-Michi-no-Eki; and the main theme, how we actually built the platform. One of our company’s good points is that it is tolerant of people taking on new challenges. I feel this initiative really embodies that, and although I talked at length before getting to the main theme, I believe (perhaps presumptuously) that the message has come across to everyone who has stuck with this article up to here. Currently, Manabi-no-Michi-no-Eki operates as a single team in the Technology Public Relations Group based on a single project. We will continue to serve as a “roadside station” where internal study sessions come together, to run activities that support internal revitalization centered on them, and to tell you about those activities through things like the Tech Blog. So, please do stay tuned!
Hello, this is Nakanishi from the Technology Public Relations Group. We have decided to launch a new community, and I'm announcing it here.

The mobile app market grows year after year, and differentiating a service now demands fast, high-quality releases. Against this backdrop, Appium—which lets you test iOS/Android apps with an operation feel close to Selenium—is attracting attention. That said, there are hurdles to clear before introducing and operating it in practice, such as differences between OS versions and the difficulty of environment setup. As a place to share these challenges and learn solutions from one another, a new community, "Appium Meetup Tokyo," has been launched.

Why We Launched "Appium Meetup Tokyo"

Growing needs
Mobile app release cycles keep getting shorter, and there are limits to testing done entirely by hand. Appium is an important option for making test automation more efficient, but knowledge about its configuration and implementation is not yet widely shared. At KINTO Technologies, too, we have set "technical capability," "development productivity," and "release speed" as key measures for 2025, and test automation with Appium is a very important initiative for achieving them. In recent years, "development productivity" is discussed at so many companies that hardly a day goes by without hearing the term, and as development speeds keep increasing, automated testing inevitably becomes more important.

Lack of information for Japan
English documentation and overseas case studies are increasing, but consolidated Appium information in Japanese is still limited. When operating it in practice, familiar case studies and stories of successes and failures are a great help. There are probably QA engineers and mobile app developers at other companies facing the same problems. We want to energize information sharing and community activity in Japanese and build better automated testing environments together.

Cooperation with Autify
While talking with Autify, which provides a mobile app test automation platform, the idea came up that it would be great to have opportunities to learn Appium, and we decided to cooperate and start this community with a small study session first. Autify uses Appium behind its service and has a wealth of knowledge, including insights that have never been shared publicly and know-how that can't be gained from one company's own app development and test automation alone, so I personally am very much looking forward to it. We also aim to be a community that can address a wide range of problems by sharing comprehensive know-how covering not only Appium but its surrounding tools as well. We are actively recruiting organizing members to help spread this activity.

Community Activities
Regular study sessions
- Introductory Appium courses
- Hands-on sessions using real devices and simulators
- Explanations of CI/CD integration
Lightning talks (LTs) and discussions
- Case studies from adopting companies
- Sharing of real-world challenges and know-how
- Exchanging update information and tips
Online information sharing
- Archives of slides and materials
- Q&A via the community Slack and social media
- Building a collection of best practices
...and many other activities we hope to pursue together with you.

About the First Event
Date: Thursday, February 20, 2025, 19:00–21:30
Venue: Autify's Tokyo office (hybrid format)
Main topics:
- Use cases of the Appium plugin at Autify (15 min + 5 min Q&A)
- Guidelines and practices for efficient app automation (15 min + 5 min Q&A)
- [Open-call LT] Talk 3 (15 min + 5 min Q&A)
- [Open-call LT] Talk 4 (15 min + 5 min Q&A)
For details, see connpass: https://autifyjapan.connpass.com/event/342867/ If you would like to speak or attend, please apply via the link above.

Future Plans
Continuous events: Through study sessions and workshops, we will meet the diverse needs of everyone from beginners to advanced users.
Integration with other tools: We will actively cover comparisons and integration examples with tools such as Selenium and Cypress to broaden practical options.
Community-driven information sharing: As events accumulate, we will share a wealth of knowledge across the community, including Appium success and failure stories, contributing to participants' skill development.

For Those Considering Joining
- Those who want to seriously introduce mobile app test automation
- Those interested in Appium who want concrete examples and know-how
- Engineers and QA staff interested in operation combined with CI/CD
- Those who want to improve their own test culture by learning from other companies' cases
If any of these apply to you, come share the latest knowledge at Appium Meetup Tokyo. Announcements and details will be posted via @AutifyJapan and @KintoTech_Dev. If you have questions or requests, feel free to reach out. We sincerely look forward to seeing you at Appium Meetup Tokyo.
This article is the entry for day 9 in the KINTO Technologies Advent Calendar 2024 . Introduction Hello, I'm Kameyama, an application engineer in the Digital Transformation Development Group. In recent years, generative AI has been applied across different fields, and our development team is also working on building systems that makes the most of its capabilities. Furthermore, since Java is widely used within our development team, we believed it would allow us to efficiently build an interface for generative AI while taking advantage of our existing knowledge and tools. With this in mind, this article will explain how to call generative AI using Java and process its results. This article will cover the basic implementation of calling Azure OpenAI using Java, with simple code examples. Compared to OpenAI, Azure OpenAI is considered a platform with greater scalability and reliability, making it better suited for large-scale business systems. Setting Up Azure Open AI First, sign up for an Azure subscription: https://azure.microsoft.com/en-us/pricing/purchase-options/azure-account?icid=ai-services&azure-portal=true Next, follow the instructions on the page below to obtain the endpoint and API key. https://learn.microsoft.com/ja-jp/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython-new&pivots=programming-language-java Access the Azure console here . * This page requires logging in with the account you registered earlier. Setting Up the OpenAI Library To call Azure OpenAI, the Azure SDK library will be used. This SDK library allows for simple and efficient coding to call generative AI in Azure OpenAI. For Gradle: dependencies { implementation 'com.azure:azure-ai-openai:1.0.0-beta.12' implementation 'com.azure:azure-core-http-okhttp:1.7.8' } For Maven: <dependencies> <dependency> <groupId>com.azure</groupId> <artifactId>azure-ai-openai</artifactId> <version>1.0.0-beta.12</version> </dependency> <dependency> <groupId>com.azure</groupId> <artifactId>azure-core-http-okhttp</artifactId> <version>1.7.8</version> </dependency> </dependencies> The version used here is the latest available at the time of writing. Refer to the Official Documentation for the most up-to-date version. In particular, the Azure OpenAI client library for Java that we will use this time is currently in beta, so we recommend that you use the stable version when it is released in the future. Calling the Azure OpenAI Chat Model in Practice Reference: https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/openai/azure-ai-openai /src/main/resource/config.properties endpoint=https://{Resource Name}.openai.azure.com/ apiKey={API Key} Enter the obtained Azure OpenAI endpoint and API key here to manage them in a separate file. You can manage sensitive information according to your own or your team's policies. /src/main/java/com/sample/app/Main.java Package com.sample.app; // match your package name import com.azure.ai.openai.OpenAIClient; import com.azure.ai.openai.OpenAIClientBuilder; import com.azure.ai.openai.models.*; import com.azure.core.credential.AzureKeyCredential; import java.util.ArrayList; import java.util.List; public class Main { public static void main(String[] args) { // Load the properties file *Modify as needed if using a different method to manage key information. 
/src/main/java/com/sample/app/Main.java

package com.sample.app; // match your package name

import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.ai.openai.models.*;
import com.azure.core.credential.AzureKeyCredential;

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class Main {
    public static void main(String[] args) {
        // Load the properties file. *Modify as needed if you manage key information differently.
        Properties properties = new Properties();
        try (InputStream input = Main.class.getClassLoader().getResourceAsStream("config.properties")) {
            if (input == null) {
                System.out.println("config.properties file not found.");
                return;
            }
            properties.load(input);
        } catch (IOException ex) {
            System.out.println(ex.getMessage());
            return;
        }

        // Retrieve configuration values from the properties
        String endpoint = properties.getProperty("endpoint");
        String apiKey = properties.getProperty("apiKey");

        // Create the OpenAI client
        var client = new OpenAIClientBuilder()
                .endpoint(endpoint)
                .credential(new AzureKeyCredential(apiKey))
                .buildClient();

        // Prepare the prompt
        List<ChatRequestMessage> messages = new ArrayList<>();
        messages.add(new ChatRequestSystemMessage("You are an excellent AI assistant."));
        messages.add(new ChatRequestUserMessage("For beginners, please explain the difference between classes and objects in Java."));

        // Set the request options
        var options = new ChatCompletionsOptions(messages)
                .setTemperature(0.7)      // Response randomness; higher is more diverse (0.0-2.0)
                .setMaxTokens(100)        // Maximum number of response tokens
                .setFrequencyPenalty(0.0) // Penalty for frequently repeated words (-2.0-2.0)
                .setPresencePenalty(0.6); // Penalty for revisiting topics that already appeared (-2.0-2.0)

        // Send the request and retrieve the result.
        // The first argument is the deployment name or generative AI model name to use.
        var chatCompletions = client.getChatCompletions("gpt-4o", options);
        for (ChatChoice choice : chatCompletions.getChoices()) {
            ChatResponseMessage message = choice.getMessage();
            System.out.printf("Index: %d, Chat Role: %s.%n", choice.getIndex(), message.getRole());
            System.out.println("Message:");
            System.out.println(message.getContent());
        }
    }
}

When calling the generative AI, you can set various parameters. These cannot currently be adjusted in the ChatGPT application you normally use, and being able to tune them is one of the benefits of calling generative AI from a program. Here we set four parameters (temperature, maxTokens, frequencyPenalty, and presencePenalty), but there are many others; see here for more details.

In addition, two types of messages were added to the messages list. The first, ChatRequestSystemMessage, is optional; the request can be executed without it.

ChatRequestSystemMessage
Sets the behavior and role of the generative AI model, defining the tone of the conversation and the response style.

ChatRequestUserMessage
Conveys the user's specific question or instructions to the AI; the response comes back from OpenAI as the return value.

For the first argument of getChatCompletions, pass a deployment name or a model name. The deployment name can be obtained from the Azure portal. If you are using OpenAI outside of Azure, pass a model name such as "gpt-4o" or "gpt-3.5-turbo". (The example above passes the model name.)

.gitignore

/src/main/resources/config.properties

If you manage the configuration in /src/main/resources/config.properties as described in this article, add the line above to .gitignore. Especially when the code lives in a shared repository, be sure to handle sensitive information such as API keys with the utmost care.
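As an aside, the same client can also stream a response token by token instead of blocking until the full completion arrives, which is useful for chat-style UIs. The following is a minimal sketch assuming the beta SDK's getChatCompletionsStream method; since the library is still in beta, double-check the exact shape of the streaming API in your version.

// A sketch of streaming the response with the same client and options as above.
// Assumes the beta SDK's getChatCompletionsStream; verify against your SDK version.
client.getChatCompletionsStream("gpt-4o", options).forEach(chunk -> {
    for (ChatChoice choice : chunk.getChoices()) {
        ChatResponseMessage delta = choice.getDelta(); // partial message fragment
        if (delta != null && delta.getContent() != null) {
            System.out.print(delta.getContent());
        }
    }
});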
Execution Result

We were able to get the following response from OpenAI. (The actual response is in Markdown format.)

Index: 0, Chat Role: assistant.
Message:
The difference between classes and objects in Java is an important concept for beginners learning programming. Here's an easy-to-understand explanation:

Class
Blueprint: You can think of a class as a blueprint or template for creating an object. It defines the attributes (fields) and behaviors (methods) of an object.
Declaration: In Java, the class keyword is used to define a class. For example, a class representing a car can be defined as follows:

public class Car {
    // Fields (attributes)
    String color;
    int year;

    // Method (behavior)
    void drive() {
        System.out.println("The car is driving");
    }
}

Object
Instance: An object is an entity (instance) generated from a class. It holds specific data and can perform operations on that data.
Creation: In Java, the new keyword is used to create an object from a class. For example, to create an object from the Car class:

public class Main {
    public static void main(String[] args) {
        // Create an instance of the Car class
        Car myCar = new Car();
        myCar.color = "Red";
        myCar.year = 2020;

        // Call the object's method
        myCar.drive();
    }
}

Summary
A class is a template for creating objects, defining their attributes and behaviors.
An object is an actual instance of the class, holding specific data and executing the defined behaviors.
By understanding this basic relationship, you can begin to build more complex programs.

Conclusion

This article provided a basic introduction to using Azure OpenAI with Java. Since there is still limited information on integrating OpenAI with Java, I hope this guide will be useful to you. Next time, I'd like to explore more advanced methods, so stay tuned!